Global ETD Search

1	Bibliotekariestudenter och söksträngsexpansion : Ett experiment om manuell söksträngsexpansion / LIS students and query expansion : An experiment on manual query expansion Karlsson, Kristoffer January 2012 The thesis focuses on an information search technique called manual query expansion, which aims to improve retrieval performance in a search system by adding terms to the search query. The study looks at the query expansions made by LIS students and at the possible beneficial effects of using a search aid during the search process. The empirical data were collected through an experiment with an experimental group and a control group. The conclusion is that the students need more training in query expansion in order to use it correctly. There are significant differences between the groups in several respects, for example in the use of search terms, parentheses, and truncation. The study should be seen as a pilot study and a proposal for further research. / Program: Librarian; query expansion; information seeking; information retrieval; Social Sciences
2	Att sjunga en fråga. En jämförelse av tre Query-by-Humming-system och deras användare. / To sing a question. A comparison of three Query-by-Humming systems and their different users. Eriksson, Madeleine January 2012 The aim of this study was to compare the retrieval effectiveness of the Query-by-Humming systems Midomi, Musicline and Tunebot. The aim was to see whether there were differences between the systems and also between three user groups: common users, musicians and singers. In a Query-by-Humming system, the user sings a tune that the system then uses to find the right melody. To compare the systems and their users, queries were collected from the different user groups and replayed to the systems. Mean Reciprocal Rank and the Friedman test were used for the comparison. The results showed that the systems did not perform equivalently and that there was no difference between the user groups. The Mean Reciprocal Rank showed that the systems differed greatly in retrieval effectiveness: Midomi achieved the best result and Musicline the lowest. / Program: Librarian; Query-by-Humming; Music Information Retrieval; information retrieval; retrieval effectiveness; Social Sciences
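Mean Reciprocal Rank, the measure named in the record above, can be sketched as follows. This is a minimal illustration and not the thesis's own procedure; the function name and the sample ranks are invented for the example. Each query contributes 1/rank of the first correct melody, or 0 if the correct melody was not returned.

```python
def mean_reciprocal_rank(ranks):
    """Return the mean of 1/rank over all queries.

    `ranks` holds, per query, the 1-based rank at which the correct
    item appeared, or None if it was not retrieved (hypothetical input).
    """
    if not ranks:
        return 0.0
    # A missing result contributes 0 to the sum.
    return sum(1.0 / r if r else 0.0 for r in ranks) / len(ranks)


# Four hummed queries: hits at ranks 1, 2 and 4, and one miss.
example_mrr = mean_reciprocal_rank([1, 2, None, 4])
```

A system that always returns the right melody first scores 1.0; lower values mean the right melody appears further down the list or not at all.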
3	Att maskinöversätta sökfrågor : En studie av Google Translate och Bing Translators förmåga att översätta svenska sammansättningar i ett CLIR-perspektiv / Machine translation of queries : A study of the ability of Google Translate and Bing Translator to translate Swedish compounds in a CLIR perspective Qureshi, Karl January 2016 The purpose of this thesis is to examine how well Google Translate and Bing Translator perform when translating queries with respect to Swedish compounds, and to try to determine whether there is any relationship between the outcome and the complexity of the compounds. The test environment is the European Parliament's public register of documents. The investigation is, however, limited to the documents of the European Council, which number 1,334 in Swedish and 1,368 in English. The data were analysed partly in terms of precision and recall and partly through a contrastive analysis, in order to give a more coherent picture of the phenomenon under study. The results show that the mean value varies between 0.287 and 0.506 for precision and between 0.400 and 0.614 for recall, depending on word type and translation service. The results further show that there does not appear to be any clear relationship between effectiveness and the complexity of the compounds. Instead, the lower values appear to be due to synonymy, often within the compound itself, and to hyponymy. In the latter case, this is due partly to the translation services' inability to produce suitable translations and partly to the tendency of English to form compounds with separately written noun modifiers. CLIR; cross-language information retrieval; information retrieval; compound; statistical machine translation
4	En tesaurus som ledsagare : En jämförande studie av tre sökstrategiers inverkan på återvinningsresultatet i en bibliografisk databas. / The thesaurus as a companion : A comparative study of three search strategies and their influence on information retrieval results in a bibliographic database. Hagberg, Lena, Müntzing, Johanna January 2006 This Master’s thesis is a comparative study of the information retrieval results of three distinct search strategies in simulated automatic query expansion in a bibliographic database. Our purpose is to investigate which of the search strategies achieves the highest precision and to what extent the same relevant documents are retrieved (overlap). A thesaurus attached to the database is used to select appropriate descriptors for the baseline query formulations, which are subsequently expanded with hierarchical relations. The search strategies are s1: a baseline query with two or three descriptors; s2: the baseline descriptors combined with at least one Narrower Term; s3: the baseline descriptors combined with a Narrower Term and at least one Broader Term. A Document Cutoff Value of 15 is used, and only the 15 highest-ranked documents are judged for relevance. The measures used are precision for effectiveness and Jaccard’s index for overlap. In terms of precision, the results reveal that s1 scores highest (84.8% on average), followed by s2 and s3 (81.94% and 61.41% on average, respectively). The overlap varies greatly depending on topic and averages 78.81% between s1 and s2, 58.48% between s2 and s3, and 40.41% between s3 and s1. In short, both average precision and average overlap decrease. Using the thesaurus in the applied strategy of automatic query expansion is not recommended in this specific database if the aim is to increase precision. However, in single searches with a structure like s1, the thesaurus can be of assistance in the selection of specific search terms.
/ Thesis level: D; information retrieval; query expansion; thesaurus; bibliographic database; controlled vocabulary; Social Sciences
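The two measures named in the record above, precision at a document cutoff value (DCV) and Jaccard's index for overlap, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names, document IDs and relevance judgments are invented for the example, and precision is divided by the cutoff itself even when fewer documents are returned.

```python
def precision_at_cutoff(ranked_ids, relevant_ids, cutoff=15):
    """Share of the top `cutoff` ranked documents that are judged relevant."""
    top = ranked_ids[:cutoff]
    if not top:
        return 0.0
    hits = sum(1 for doc_id in top if doc_id in relevant_ids)
    return hits / cutoff


def jaccard_overlap(docs_a, docs_b):
    """Jaccard's index: size of the intersection divided by size of the union."""
    a, b = set(docs_a), set(docs_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical data: a ranked list from one strategy and its relevant documents.
ranked = ["d1", "d2", "d3", "d4"]
relevant = {"d1", "d3"}
p_at_4 = precision_at_cutoff(ranked, relevant, cutoff=4)

# Hypothetical relevant documents retrieved by two different strategies.
overlap = jaccard_overlap({"d1", "d2", "d3"}, {"d2", "d3", "d4"})
```

With a cutoff of 15, as in the record, only the 15 highest-ranked documents enter the precision figure, and the overlap is computed pairwise between the relevant documents that each strategy retrieves.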
5	Cross-language information retrieval : sökfrågestruktur & sökfrågeexpansion / Cross-language information retrieval : query structure & query expansion Nyman, Marie, Patja, Maria January 2008 This Master’s thesis examines different retrieval strategies used in cross-language information retrieval (CLIR). The aim was to investigate whether there were any differences in retrieval effectiveness between baseline queries and translated queries, how retrieval effectiveness was affected by query structuring, and whether the results differed between languages. The languages used in this study were Swedish, English and Finnish. Thirty topics from the TrecUta collection were translated into Swedish and Finnish. Baseline queries in Swedish and Finnish were formulated and translated into English using a dictionary, thereby simulating automatic translation. The queries were expanded by adding all the translations from the main entries to the queries. Two kinds of queries, structured and unstructured, were designed. The queries were fed into the InQuery IR system, which presented a list of retrieved documents in which the relevant ones were marked. The performance of the queries was analysed with the Query Performance Analyser (QPA). Average precision at seen relevant documents at DCV 10, average precision at DCV 10, and precision and recall at DCV 200 were used to measure retrieval effectiveness. Despite the morphological differences between Swedish and Finnish, no or only very small differences in retrieval performance were found, except when average precision at DCV 10 was used. The baseline queries achieved the best results, and the structured queries performed better than the unstructured queries in both Swedish and Finnish. The results are consistent with previous research.
/ Thesis level: D; CLIR; cross-language information retrieval; multilingual information retrieval; query expansion; retrieval effectiveness; effectiveness measures; query structure; query structuring; Social Sciences
6	Automatisk genreklassifikation : en experimentell studie / Automatic genre classification : an experimental study Nolgren, Markus January 2008 This thesis examines to what extent a few document features that are algorithmically very easy to extract can be used to classify electronic documents according to genre. A set of experiments is therefore carried out, using only 11 such simple features in an attempt to classify 84 documents belonging to electronic academic journals into three manually identified genres: table of contents, article, and review. The 11 features are also divided into three sets, containing metrics of words and sentences; punctuation marks; and URL links, respectively. The performance when using these sets of features is then measured with regard to classification accuracy, using a k-NN classifier, four different values of k (1, 3, 5, 7), and both leave-one-out and 10-fold cross-validation. The best results are achieved when using all three feature sets (i.e. all 11 features) and k=3, with an overall accuracy of 96% (81 of the 84 documents correctly classified), regardless of the cross-validation method. These results are significantly better than those of a reference baseline, defined as the case where all instances are assigned to the most populated class, which gives an accuracy of 49%. While the results are not disappointing in any way, the author views them as perhaps reflecting a somewhat easy classification task. He therefore concludes by advocating further research on the capability of very simple features to contribute to accurate automatic genre classification, preferably using experimental settings that are better suited to shed light on this matter. / Thesis level: D; automatic genre classification; genre; document genre; automatic classification; information retrieval; IR systems; machine learning; Social Sciences
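The evaluation set-up described in the record above, a k-NN classifier scored with leave-one-out cross-validation, can be sketched as follows. This is a minimal illustration, not the author's implementation: the record does not state the distance function, so Euclidean distance is an assumption, and the two-feature toy data and labels are invented for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points.

    `train` is a list of (feature_vector, label) pairs. Euclidean
    distance is an assumption; the source does not name a metric.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(data, k=3):
    """Classify each item using all the other items as training data."""
    correct = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) == label:
            correct += 1
    return correct / len(data)


# Hypothetical documents described by two simple numeric features.
toy_data = [
    ((0.0, 0.0), "article"), ((0.0, 1.0), "article"), ((1.0, 0.0), "article"),
    ((10.0, 10.0), "review"), ((10.0, 11.0), "review"), ((11.0, 10.0), "review"),
]
toy_accuracy = leave_one_out_accuracy(toy_data, k=3)
```

In leave-one-out validation every document is held out exactly once, so with 84 documents the classifier would be trained 84 times on the remaining 83.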
7	Hur söker användare kinesiskt material i LIBRIS : 找得到吗? / How does users search for Chinese language material in LIBRIS : 找得到吗? Svanström, Erik January 2012 The purpose of this thesis is to investigate what function the National Library Catalogue of Sweden (LIBRIS) has as an information resource for Chinese-language material. How is LIBRIS used as an information resource, seen from the point of view of a user's specific language needs? This is exemplified through an empirical survey of three user groups with different levels of knowledge of Chinese. Group A were beginners who had studied Chinese for one year. Group B were advanced students in their third year of Chinese. Group C consisted of students with Chinese as their mother tongue. How do Swedish students who study Chinese and Chinese students, both at Lund University, search for Chinese-language material in LIBRIS? I gave the participants a list of eight titles of Chinese books written in Chinese characters and asked them to retrieve them in LIBRIS. Their sessions at the computer were recorded with the screen capture program iShowU. In that way I was able to study their navigation and searches very thoroughly. After each search session I interviewed them about their thoughts and feelings regarding the search process. My empirical study shows that there is a clear need to revise and improve the retrieval function for Chinese-language material and the display of Chinese characters in the result lists and within bibliographic records. My method is based on empirical research through a set of qualitative observations and semi-structured interviews. / Program: Librarian; Chinese; national rules; cataloguing; information retrieval; LIBRIS; non-Latin script; Social Sciences
8	Indexering av humanistisk litteratur och humanistiska databaser : exemplet MLA International Bibliography / Indexing in the humanities and humanities databases : a study of the MLA International Bibliography Lundtoft, Linn January 2000 This thesis aims to answer the following questions: What type of vocabulary is used in the humanities? What does this imply for indexing in the humanities? How do humanities scholars seek information and what type of information do they need? What consequences does this have for the development of humanities databases? How is the MLA International Bibliography organized? How does its indexing system, CIFT, work? How well does the MLA International Bibliography correspond to the needs of the humanities scholar? The study shows that the type of vocabulary used in the humanities differs significantly from that used in the sciences. Information retrieval is therefore often said to be problematic in the humanities compared to the sciences. This is, however, not completely true. If the different type of vocabulary used in the humanities is taken into account, subject access proves to be more straightforward than has been generally recognized. The study also shows that humanities scholars have slightly different information needs from scientists. For example, they use different types of publications in their research. On the whole, the MLA International Bibliography corresponds well to the different needs of humanities scholars. The vocabulary used in its thesaurus is based on the type of vocabulary they actually use. The database also includes references to various publication types. Even so, the MLA International Bibliography could do with a few improvements in order to enhance its value to the research community.
The inclusion of abstracts, more references to internationally published material, and a shorter time lapse between primary source and online version are areas that need to be improved. / Thesis level: D; humanities; databases; information retrieval; information needs; indexing; MLA International Bibliography; Social Sciences
9	Cross-Language Information Retrieval : En studie av lingvistiska problem och utvecklade översättningsmetoder för lösningar angående informationsåtervinning över språkliga gränser. / Cross-Language Information Retrieval : A study of linguistic problems and of translation methods developed as solutions for information retrieval across language boundaries. Boström, Anna January 2004 This essay deals with information retrieval across languages by examining different types of literature in the research areas of linguistics and multilingual information retrieval. The essay argues that the many different languages that co-exist around the globe must be recognised as an essential obstacle for information science. The language barrier today remains a major impediment to the expansion of international information retrieval that has otherwise been made technically and theoretically possible over the last few years by new technical developments, the Internet, digital libraries, globalisation, and major political changes in several countries around the world.
The first part of the essay explores linguistic differences and difficulties related to general translation from one language to another, using examples mainly from European languages. It is suggested that these problems and differences must also be acknowledged and regarded as highly important when it comes to information retrieval across languages. The essay continues by reporting on Cross-Language Information Retrieval (CLIR), a relatively new research area where methods for multilingual information retrieval are studied and developed. The aim of CLIR is for people eventually to be able to search for information in their native tongue but still find relevant information in more than one language. Another goal for the future is the possibility of translating complete documents into a person’s language of preference. The essay reports on four different CLIR methods currently established for automatically translating queries, subject headings, or, in some cases, complete documents, and thus aiding people with little or no knowledge of the language in which they are looking for information. The four methods (machine translation; translation using a multilingual thesaurus or a manually produced machine-readable dictionary; corpus-based translation; and no translation) are discussed in relation to the linguistic translation difficulties mentioned in the essay’s initial part. The conclusion drawn is that language is exceedingly complex and that, while the different CLIR methods currently developed can often solve one or more of the acknowledged linguistic difficulties, none is able to overcome them all. The essay also shows, however, that CLIR scientists are highly aware of the limitations of the different translation methods and that many are trying to come to terms with this by incorporating several sources of translation in one single CLIR system. The essay concludes by looking at CLIR scientists’ expectations and hopes for the future.
Cross-Language Information Retrieval; CLIR; multilingual information retrieval; linguistics; Multilingual Information Access; Library and information science
10	Cross-language information retrieval : en studie av lingvistiska problem och utvecklade översättningsmetoder för lösningar angående informationsåtervinning över språkliga gränser / Cross-language information retrieval : a study of linguistic problems and of translation methods developed as solutions for information retrieval across language boundaries Boström, Anna January 2004 This essay deals with information retrieval across languages by examining different types of literature in the research areas of linguistics and multilingual information retrieval. The essay argues that the many different languages that co-exist around the globe must be recognised as an essential obstacle for information science. The language barrier today remains a major impediment to the expansion of international information retrieval that has otherwise been made technically and theoretically possible over the last few years by new technical developments, the Internet, digital libraries, globalisation, and major political changes in several countries around the world.
The first part of the essay explores linguistic differences and difficulties related to general translation from one language to another, using examples mainly from European languages. It is suggested that these problems and differences must also be acknowledged and regarded as highly important when it comes to information retrieval across languages. The essay continues by reporting on Cross-Language Information Retrieval (CLIR), a relatively new research area where methods for multilingual information retrieval are studied and developed. The aim of CLIR is for people eventually to be able to search for information in their native tongue but still find relevant information in more than one language. Another goal for the future is the possibility of translating complete documents into a person’s language of preference. The essay reports on four different CLIR methods currently established for automatically translating queries, subject headings, or, in some cases, complete documents, and thus aiding people with little or no knowledge of the language in which they are looking for information. The four methods (machine translation; translation using a multilingual thesaurus or a manually produced machine-readable dictionary; corpus-based translation; and no translation) are discussed in relation to the linguistic translation difficulties mentioned in the essay’s initial part. The conclusion drawn is that language is exceedingly complex and that, while the different CLIR methods currently developed can often solve one or more of the acknowledged linguistic difficulties, none is able to overcome them all. The essay also shows, however, that CLIR scientists are highly aware of the limitations of the different translation methods and that many are trying to come to terms with this by incorporating several sources of translation in one single CLIR system. The essay concludes by looking at CLIR scientists’ expectations and hopes for the future.
Cross-Language Information Retrieval; CLIR; multilingual information retrieval; linguistics; Multilingual Information Access; Library and information science
