1 |
Automatiserad matchning av relaterad data från olika datakällor / Automated matching of related data from different data sources
Harch, Gais, Ullström, Robin January 2014
Social media today contains a great deal of information that can add considerable value to applications and products by improving the user experience. In some cases, such information cannot be obtained without first matching data from one or more data sources through data fusion. Eniro Initiatives AB wants to explore ways of performing automated data fusion by linking companies from its own API to the corresponding companies on social media. The difficulty is that the only fully reliable key for matching Swedish companies is the organisation number (organisationsnummer), which is not available through APIs from foreign companies. The aim was to investigate ways of automatically matching related data from different data sources. In this thesis, a prototype was developed that matches companies from Eniro's API with company pages from Facebook's API. Tests of the prototype revealed shortcomings, however: in some cases redundant information led the prototype to approve unofficial pages associated with the relevant company, which was not desirable.
|
2 |
An Extension of The Berry-Ravindran Algorithm for protein and DNA data
Riekkola, Jesper January 2022
String matching algorithms search a text for occurrences of a given pattern. Many of them achieve their performance by analysing the pattern in advance and storing the result; during the search phase, this information is used to decide which parts of the text can be skipped. One such algorithm is Berry-Ravindran, which looks at the two text characters immediately past the current attempt and checks whether they occur in the pattern. This thesis compares the Berry-Ravindran algorithm with new versions of itself that check three and four characters instead of two, and with the Boyer-Moore algorithm. Checking more characters lets more of the text be skipped and so reduces the number of attempts needed, but it increases the pre-processing time exponentially. Fewer attempts therefore do not necessarily mean a faster run time. The variable with the largest impact on pre-processing time is the size of the alphabet used by the text. This is investigated by testing the algorithms with patterns from 4 to 100 characters long on two data sets: protein data, with an alphabet size of 27, and DNA data, with an alphabet size of 4.
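As an illustration of the two-character idea described in the abstract, here is a minimal Python sketch of the standard Berry-Ravindran shift table and search loop. It is not the thesis code: the function names, the dictionary-based table, and the `SENTINEL` padding character (assumed not to occur in the pattern or text) are choices made for this example only.

```python
SENTINEL = "\0"  # assumed padding character, must not occur in pattern or text


def build_shift_table(pattern, alphabet):
    """Pre-processing: one shift value for every ordered character pair."""
    m = len(pattern)
    # Default: neither character helps, so skip past both of them.
    table = {(a, b): m + 2 for a in alphabet for b in alphabet}
    # Second character equals the first pattern character.
    for a in alphabet:
        table[(a, pattern[0])] = m + 1
    # The pair occurs inside the pattern; later (rightmost) pairs win.
    for i in range(m - 1):
        table[(pattern[i], pattern[i + 1])] = m - i
    # First character equals the last pattern character.
    for b in alphabet:
        table[(pattern[-1], b)] = 1
    return table


def berry_ravindran_search(pattern, text):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    alphabet = set(text) | set(pattern) | {SENTINEL}
    shift = build_shift_table(pattern, alphabet)
    padded = text + SENTINEL * 2  # lets the last window read two characters
    matches = []
    j = 0
    while j <= n - m:
        if padded[j:j + m] == pattern:
            matches.append(j)
        # Shift by the two characters just past the current window.
        j += shift[(padded[j + m], padded[j + m + 1])]
    return matches


if __name__ == "__main__":
    print(berry_ravindran_search("GCAGAGAG", "GCATCGCAGAGAGTATACAGTACG"))
```

The table holds one entry per ordered pair of alphabet characters, so a two-character table has 4^2 = 16 entries for DNA and 27^2 = 729 for protein data. A table built in the same way over four characters would need 27^4 = 531,441 entries for protein data, which is consistent with the abstract's point that pre-processing cost grows quickly with alphabet size.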
|