Global ETD Search

61	Improving Performance of Biomedical Information Retrieval using Document-Level Field Boosting and BM25F Weighting Jervidalo, Jørgen January 2010 Corpora of biomedical information typically contain large amounts of ambiguous data, as proteins and genes can be referred to by a number of different terms, making information retrieval difficult. This thesis investigates a number of methods for increasing the precision and recall of searches within the biomedical domain, including using the BM25F model for scoring documents and using Named Entity Recognition (NER) to identify biomedical entities in the text. We have implemented a prototype for testing the approaches, and have found that combining several methods, including using three different NER models at once, yields a significant increase (up to 11.5%) in mean average precision (MAP) over our baseline result. ntnudaim:4443 MIT informatikk Informasjonsforvaltning
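The abstract above reports its gain as mean average precision (MAP). As a reminder of what that metric measures, here is a minimal sketch of the standard definition; the function names and the toy document IDs are illustrative assumptions and are not taken from the thesis.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one query.

    Averages precision@k over the ranks k at which a relevant document
    appears, divided by the total number of relevant documents.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)


def mean_average_precision(runs):
    """MAP over a list of (ranked_ids, relevant_ids) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


if __name__ == "__main__":
    # Hypothetical toy data: two queries with known relevant documents.
    runs = [
        (["d1", "d2", "d3", "d4"], {"d1", "d3"}),
        (["d5", "d6"], {"d6"}),
    ]
    print(mean_average_precision(runs))
```

A relative improvement such as the one quoted would then be computed by comparing the MAP of a modified ranking against the MAP of the baseline ranking over the same queries.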
62	Å finne gammelt nytt : Bruk av NewsML i dagspressen / Finding old news : the use of NewsML in the daily press Ingeberg, Marit January 2006 Central to this thesis is a classic problem: introducing new technology requires developing new methods and work routines, while the old routines are kept nearly unchanged. This increases the complexity of an organisation's systems, since both old and new methods and work routines must be maintained. This has happened at Adresseavisen. When new technology and new working methods were introduced, workflow and organisational structure were not considered as a whole. This shows in the solutions chosen for analogue vs. digital images, and for the print newspaper vs. the online edition. Specifically, this thesis addresses document retrieval in daily newspapers, with a focus on how the NewsML format can be used for this purpose. NewsML was launched in 2000 by the International Press Telecommunications Council (IPTC), in cooperation with the news agencies Reuters and AFP. The goal of NewsML is a unified format covering every stage of a news object's life cycle, in which the same article can be stored in different languages and with support for all formats and display media. Information retrieval is important both for the journalists who are to produce a good newspaper and for readers who want to search for articles, for example on the internet. Good search systems can be decisive for whether a newspaper manages to keep a reader. The thesis describes the search process itself, along with various criteria by which a search system can be evaluated. As an example of how such a system works at a daily newspaper, Adresseavisen's system for storing and retrieving data is described. Adresseavisen distinguishes clearly between different news objects such as text, audio, images or film. The news objects for the print newspaper are stored in separate databases, while news objects for the online newspaper are stored in yet another database.
These different news objects are described with different metadata formats. As a result, finding all parts of a news article at Adresseavisen means searching several databases with different metadata formats. The format Adresseavisen uses can make the search process somewhat cumbersome. Unless the person performing the search knows the structure of the databases and the metadata formats well, valuable information may be missed. Many standards and formats have been created to support retrieval, exchange and storage of news. Some of the most widely used are described briefly, before NewsML is described in detail. NewsML offers an elegant solution to Adresseavisen's problem of different databases and different metadata formats by defining every news object as one element, regardless of its media type and intended use. A NewsML document contains a complex XML structure. In this structure, a whole article is represented by an element containing the metadata common to the entire news article. Inside this element there can be various news components, which in turn can contain further news components or content elements. The news components can, for example, distinguish between versions of the same article in different languages. The NewsML standard has some mandatory metadata fields, but is still open enough that media groups with different interests and focus can adapt a NewsML system to their needs. To illustrate in practice how NewsML could be used at Adresseavisen, conversion tables have been made between various metadata records at Adresseavisen and a NewsML document. A prototype that uses NewsML has also been implemented to show the advantages and challenges of NewsML as a format. The prototype implements all NewsML elements. The prototype's interface shows the challenges posed by NewsML's complex XML structure.
Adresseavisen's system is steadily growing in complexity and should be changed in the long run. For such a change, Adresseavisen has several options. One alternative is to adopt NewsML's principle of storing different news elements as uniform objects. These objects can be stored in a relational database, as Adresseavisen does today. This would give the newspaper a unified and clear system in which all news objects share a metadata format, but the system would lose NewsML's advantage of storing different versions of an article with the same metadata. Another solution is to introduce NewsML with its complex structure. One need not implement all NewsML elements at once, but can simplify the structure somewhat and extend it as needed. ntnudaim:3266 MIT informatikk Informasjonsforvaltning
63	Improving Performance of Biomedical Retrieval using Expectation Maximization Bjerkan, Alexander Borgerud January 2011 In this assignment you will implement a statistical method for improving a search result, using the Expectation-Maximization (EM) method. The task is to find out how useful EM is for re-ranking an initial search result so that the most relevant hits come first. You will then compare your method with other methods already used for the same purpose. Specifically, you will: implement EM in Java; implement a search system based on an open-source search library, e.g. Lucene; integrate EM into the search system; evaluate the system based on the quality of the search results, using a standard test collection with relevance judgments; compare your result with existing method(s); and test the performance of the EM algorithm as a function of index size. ntnudaim:6121 MIT informatikk Informasjonsforvaltning
64	Improving Search in Social Media Images with External Information Oftedal, Mathilde Ødegård, Sæther, Marte Johansen January 2014 The use of social media has increased considerably in recent years, and users share a lot of their daily life in social media. Many users upload images to photo-sharing applications and categorize their images with textual tags. Users do not always use the best tags to describe the images, but add tags to get "likes" or use tags as a status update. For this reason, searching on tags is unpredictable and does not necessarily return the result the user expected. This thesis studies the impact of expanding queries in image searches with terms from knowledge bases, such as DBpedia. We study the methods TF-IDF, Mutual Information and Chi-square to find related candidates for query expansion. The thesis reports on how we implemented and applied these methods in a query expansion setting. Our experiments show that Chi-square is the method that yields the best average precision, and that it was slightly better than a search without query expansion. TF-IDF gave the second-best result with query expansion, and Mutual Information gave the worst average precision. Query expansion with related terms is an exciting field, and the findings of this thesis give a good indication that it should be explored further in the future. ntnudaim:9776 MIT informatikk Informasjonsforvaltning
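To make the Chi-square selection step mentioned above more concrete, the following is a minimal sketch of scoring candidate expansion terms by the standard 2x2 chi-square statistic over document co-occurrence. It is not the thesis's implementation: the function names, the toy tag sets, and the choice to keep only positively associated candidates are all assumptions made for illustration.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 co-occurrence table.

    n11: docs with both candidate and query term
    n10: docs with the candidate but not the query term
    n01: docs with the query term but not the candidate
    n00: docs with neither
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def rank_candidates(query_term, docs, top_k=3):
    """Rank co-occurring terms as expansion candidates for query_term."""
    docs = [set(d) for d in docs]
    vocab = set().union(*docs) - {query_term}
    scored = []
    for cand in vocab:
        n11 = sum(1 for d in docs if cand in d and query_term in d)
        n10 = sum(1 for d in docs if cand in d and query_term not in d)
        n01 = sum(1 for d in docs if cand not in d and query_term in d)
        n00 = len(docs) - n11 - n10 - n01
        # Assumption: keep only terms that co-occur MORE than chance,
        # since chi-square alone is also high for negative association.
        if n11 * n00 > n10 * n01:
            scored.append((cand, chi_square(n11, n10, n01, n00)))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical tag sets standing in for image annotations.
    tagged_images = [
        {"oslo", "norway", "fjord"},
        {"oslo", "norway"},
        {"paris", "france"},
        {"paris", "france", "louvre"},
    ]
    print(rank_candidates("oslo", tagged_images))
```

The top-ranked candidates would then be appended to the original query; how many terms to add and how to weight them are tuning choices the abstract does not specify.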
65	Efficient Algorithms for Video Segmentation Kosmo, Vegard Andre January 2006 Describing video content without watching the entire video is a challenging matter. Textual descriptions are usually inaccurate and ambiguous, and if the amount of video is large, this manual task is almost endless. If the textual description is replaced with pictures from the video, the description becomes far more adequate. The main challenge is then which pictures to pick to make sure the entire video content is covered by the description. TV stations with an associated video archive would prefer an effective and automated method for this task, one that keeps time consumption to an absolute minimum while making the output as precise as possible compared with the actual video content. In this thesis, three different methods for automatic shot detection in video files have been designed and tested. The goal was to build a picture storyline from input video files, where the storyline contains one picture from each shot in the video. This task should be done in a minimum of time. Since video files are in fact one long series of consecutive pictures, various image properties have been used to detect the video shots. The final evaluation was based on both output quality and overall time consumption. The test results show that the best method detected video shots with an average accuracy of approximately 90%, with an overall time consumption of 8.8% of the actual video length. Combined with some additional functionality, these results may be further improved. With the solutions designed and implemented in this thesis, it is possible to detect shots in any video file and create a picture storyline to describe the video content.
Possible areas of application are TV stations and private individuals who need to describe a collection of video files in an effective and automated way. ntnudaim SIF2 datateknikk Data- og informasjonsforvaltning
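The abstract says that image properties of consecutive frames were used to detect shots, without naming them. As one common illustration of the idea, and not necessarily any of the three methods in the thesis, the sketch below flags a shot boundary when the grayscale histogram changes sharply between neighbouring frames. The frame representation (flat lists of 0-255 pixel values), the bin count, and the threshold are assumptions.

```python
def histogram(frame, bins=8):
    """Normalised grayscale histogram of a frame given as a flat pixel list."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1
    return [c / total for c in counts]


def detect_shot_starts(frames, threshold=0.5, bins=8):
    """Return indices of frames that start a new shot.

    A boundary is declared when half the L1 distance between consecutive
    histograms (a value in [0, 1]) exceeds the threshold.
    """
    if not frames:
        return []
    starts = [0]
    prev = histogram(frames[0], bins)
    for i in range(1, len(frames)):
        cur = histogram(frames[i], bins)
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / 2
        if diff > threshold:
            starts.append(i)
        prev = cur
    return starts


if __name__ == "__main__":
    # Hypothetical 16-pixel frames: two dark, two bright, one dark.
    dark, bright = [10] * 16, [240] * 16
    frames = [dark, dark, bright, bright, dark]
    print(detect_shot_starts(frames))
```

Taking the frame at each returned index gives one picture per shot, which matches the storyline idea described above; a production system would decode real video frames and likely smooth over gradual transitions.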
66	PORDaS : Peer-to-peer Object Relational Database System Eide, Eirik, Standal, Odin Hole January 2006 This master's thesis presents PORDaS, the Peer-to-peer Object Relational Database System. It is a continuation of work done in a project in the fall of 2005, where the foundation for the thesis was laid. The focus of the work is on distributed query processing between autonomous databases in a structured peer-to-peer network. A great deal of effort has gone into compiling the theoretical foundation for the project, which served as a basis for assessing alternative approaches to introducing a query processor in a peer-to-peer database. The old PORDaS version was extended to include a simplified, pipelined query processor capable of joining tables. The query processor had two execution strategies: the first performed join operators at the requesting node, and the second performed join operators in parallel across the nodes participating in the query. Experiments that ran PORDaS on a cluster of 36 computers showed that there is room for improvement, even though the system was able to complete all the tests. ntnudaim SIF2 datateknikk Data- og informasjonsforvaltning
67	SAM Engine : Model-based Framework for Scalability Assessment Holmefjord, Anders Johan January 2006 Today's way of life involves increasing amounts of information, and therefore more handling and processing of information. Almost everything you do involves some sort of computer somewhere, and many businesses have built comprehensive computer systems into their corporate structure to serve both employees and customers. But if a new service is introduced to the users, or a new group of users is introduced to an existing service, how do you know whether the performance will be satisfactory? To deal with such questions, a method called the Scalability Assessment Method (SAM) has been developed. SAM is a general procedure for evaluating the scalability of a system architecture. Other projects have applied SAM to real reference systems, and their results have shown that SAM can be trusted to give credible predictions. Until recently, dedicated software tools supporting SAM have been absent, and researchers have used, among other things, spreadsheets in an ad hoc approach to the problems. Therefore, a SAM software package is in development. The SAM Engine (SAMe) is a Java program developed in this project, with an intuitive user interface that enables a non-expert user to apply the method to a desired architecture. This report documents the development of the prototype SAM Engine (SAMe) and how the program supports the SAM method. Keywords: Performance evaluation, scalability, simulation, Structure and Performance, SAM. ntnudaim SIF2 datateknikk Data- og informasjonsforvaltning
68	Perspectives to Ad-hoc Extensions of Cellular Networks Schjønhaug, Andreas January 2006 With more and more cellular phone handsets being equipped with Wi-Fi interfaces, an alternative to using the purely cell-based mobile phone network infrastructure can be envisioned: mobile phone cells can be extended via ad-hoc relaying of traffic from out-of-coverage handsets towards in-coverage handsets, and mobile phones might be able to establish communication paths between them without passing through the cellular infrastructure. This presents a number of challenges, including billing, security, multi-fabric routing, hand-over and quality of service. One task of this project is thus to look at the necessity of brokering network access, addressing the variables that influence its success. The other task is to construct the necessary mechanisms for establishing the ad-hoc part of an extension to cellular networks. This involves analysing ad-hoc networking and implementing OLSRv2, one of several experimental solutions for ad-hoc networking that exist within the Internet Engineering Task Force (IETF). To emulate such a system, and also to debug and evaluate the implementation, a test-bed needs to be constructed. ntnudaim SIF2 datateknikk Data- og informasjonsforvaltning
69	Geographical Location of Internet Hosts using a Multi-Agent System Thorvaldsen, Øystein Espelid January 2006 This thesis focuses on a part of Internet forensics concerned with determining the geographic location of Internet hosts, also known as geolocation. Several techniques for geolocation exist. A classification of these techniques and a comparative analysis of their properties are conducted. Based on this analysis, several novel improvements to current techniques are suggested. As part of an earlier designed Multi-Agent Framework for Internet Forensics (MAFIF), an application implementing two active-measurement geolocation techniques is designed, implemented and tested. Experiments with the application are performed in the Uninett network, with the goal of identifying the impact of different network properties on geolocation. What most clearly sets this thesis apart from earlier work, in addition to the use of a multi-agent system, is the analysis of the impact of IPv6 on geolocation and the introduction of multi-party computation to geolocation. The extensive focus on delay measurements, although not bringing anything new to the field of networking in general, is also new to geolocation as far as we know. ntnudaim SIF2 datateknikk Data- og informasjonsforvaltning
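The abstract refers to active-measurement techniques based on delay. A generic illustration of why delay constrains location, not the specific techniques implemented in the thesis, is that a round-trip time gives an upper bound on distance, because signals cannot travel faster than light in the medium. The sketch below assumes a propagation speed of about 200 km per millisecond (roughly two thirds of the speed of light, a common approximation for optical fibre); the landmark names and RTT values are hypothetical.

```python
# Assumed propagation speed in fibre: about 2/3 of c, i.e. ~200 km per ms.
SPEED_IN_FIBRE_KM_PER_MS = 200.0


def max_distance_km(rtt_ms):
    """Upper bound on the distance to a host given a round-trip time."""
    if rtt_ms < 0:
        raise ValueError("round-trip time cannot be negative")
    one_way_ms = rtt_ms / 2.0
    return one_way_ms * SPEED_IN_FIBRE_KM_PER_MS


def nearest_landmark(rtts_ms):
    """Pick the landmark with the lowest RTT and bound the target's distance.

    rtts_ms maps a landmark name to the RTT measured from it to the target.
    Returns (landmark_name, max_distance_km_from_that_landmark).
    """
    if not rtts_ms:
        raise ValueError("at least one measurement is required")
    name = min(rtts_ms, key=rtts_ms.get)
    return name, max_distance_km(rtts_ms[name])


if __name__ == "__main__":
    # Hypothetical measurements from two landmarks to one target host.
    measurements = {"trondheim": 4.0, "oslo": 12.0}
    print(nearest_landmark(measurements))
```

Real measurements include queuing and routing detours, so the true distance is usually far below this bound; that gap is one reason network properties matter for geolocation accuracy.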
70	Industriell IT: Arkitektur for integrasjon og bruk av prosessinformasjon til sporing og verdikjedestyring / Industrial IT: Architecture for integration og and use of Process Information for tracking&trace and Valuechain management Vevle, Geir January 2006 Traceability has recently been put firmly on the agenda. With the memories of mad cow disease and scrapie almost forgotten, E. coli and salmonella became major media events. As a result, the food industry and the authorities have become aware of the problem, and the Minister of Agriculture and Food, Terje Riis-Johansen, has launched the eSporing project, which aims to have a technical solution for traceability in place by 2010. Given how information systems are usually built in the food industry, with an ERP system reaching all the way down into individual process units, this is hard to achieve well. At the same time, general traceability has proved profitable only in the event of a recall, that is, in a crisis. Otherwise, keeping control of the food products only costs more money and work. There are, however, several ways in which traceability can be put to positive use, such as value chain management and the collection of key parameters for process optimisation and product differentiation. The problem is that this requires better traceability than has proved possible when tracing is handled at the central level. Dedicated systems at factory level are therefore needed to improve traceability, and this requires an architecture that accommodates such systems. Data collection and visits to various factories have made it possible to identify the requirements such an architecture must meet. The requirements for traceability and value chain management have been central. Traceability here means both tracing of process information and tracing of which input factors have gone into a traceable unit.
Process information is defined here as process-unit events that matter only locally on a unit, while input factors are defined as all units that accompany the product further along the food supply chain. From these requirements, a conceptual architecture has been created. It recommends a layering equivalent to that of the ISA 95 standard. Network architecture and information architecture are also defined. The technical architecture is not defined, because a goal has been to keep the architecture independent of specific technologies. The overall data model described is designed to ensure support for process information from both load carriers and process equipment. It also supports tracing forwards and backwards in the value chain, in that all input factors are registered and follow the traceable unit. The thesis discusses how the architecture, together with a dedicated layer of information systems, can best support increased traceability and increased automation and provide a flexible, future-oriented platform to build on. The result is a general architecture that can support traceability and value chain management in the best possible way. ntnudaim SIF2 datateknikk Data- og informasjonsforvaltning
