Global ETD Search

1	Aplikace pro získávání názorů z uživatelských recenzí / Application for opinion mining from user reviews Švec, Jiří January 2017 This paper describes current methods for customer opinion mining and text processing. An application architecture is designed along with a database structure, and a new application is implemented on that basis. Emphasis is placed on automated downloading and processing of customer reviews, opinion extraction, and user-friendly presentation. Part of the paper is dedicated to optimizing the application for efficiency and speed.
2	TIKTOK FORENSIC SCRAPER TO RETRIEVE USER VIDEO DETAILS Akshata Nirmal Thole (14221547) 06 December 2022 Thesis - Akshata Thole. Digital forensics TikTok Web Scraping OSINT Security
3	Ladok Browser Extension : An Evaluation of Browser Extension API:s Rahman, Mukti Flora January 2022 The objective of this study was to examine whether it is possible to develop a user interface in the form of a browser extension for Ladok, a results system used by higher education institutions such as colleges and universities in Sweden. Part of the study was also to evaluate at least one method of developing browser extensions. Interviews and surveys were conducted to understand which functions would benefit both students and teachers in such a user interface. GUI mockups were created in the design tool Figma and then evaluated through usability tests. The main challenge during the study was determining whether it is possible to access data from Ladok through web scraping and API requests. Because Ladok contains confidential information about students, the data is private, which made it very difficult to gather data. Different methods and approaches were tested to determine whether it would be possible to develop a user interface for Ladok. The conclusion is that more research and time are needed and that there is no clear solution yet. Future work could be to develop user scripts that run only when Ladok is in use. An example of a tool for user scripts is TamperMonkey, which is compatible with Google Chrome. GreaseMonkey is equivalent to TamperMonkey but is compatible with Mozilla Firefox. Browser extension Web Scraping API JavaScript TamperMonkey GreaseMonkey Software Engineering
4	En jämförelse av prestanda mellan centraliserad och decentraliserad datainsamling / A comparison of performance between centralised and decentralised data collection Hidén, Filip, Qvarnström, Magnus January 2021 In the modern world, data and information are used on a larger scale than ever before. Much of this information is stored on the internet in many different forms, such as articles, files and web pages. A new project or company that depends on this data needs a way to efficiently search for, sort and gather what it needs to process. A common method for this is Web scraping, which can be implemented in several different ways to search for and gather data. This can be an expensive investment for smaller companies, as Web scraping is an intensive process that usually requires paying for a server powerful enough to manage everything. The purpose of this report is to investigate whether there are cheaper alternatives for implementing Web scraping that don't require access to expensive servers. To answer this, the subject of Web scraping was researched along with the system architectures used in the industry to implement it. This research was then used to develop a Web scraping application that collects ingredients from recipe articles online, implemented both on a centralised server and as a decentralised implementation on an Android device. Finally, the research and the results from performance tests of the two applications were combined to reach a result. The conclusion drawn was that decentralised Android implementations are a valid and functional solution for Web scraping today; however, the difference in performance means they are not useful in every situation. Instead, the choice must be based on the specifications and requirements of the particular company. There is also very little research on this topic, so further investigation is needed to keep developing implementations and knowledge in this area. Bachelor's thesis Web scraping data collection centralised Web scraping decentralised Web scraping Android mobile devices Computer and Information Sciences
5	Alternative Information Gathering on Mobile Devices Jakupovic, Edin January 2017 Searching for and gathering information about specific topics is a time-consuming but vital practice. As the mobile market continues to grow and surpass desktop devices, it is becoming a more important area to consider. Due to the portability of mobile devices, certain tasks are more difficult to perform than on a desktop device. Searching for information online is generally slower on mobile devices than on desktop devices, even though the majority of searches are performed on mobile devices. The largest challenges with searching for information online using mobile devices are the smaller screen sizes and the time spent jumping between sources and search results in a browser. These challenges could be solved by an application that focuses on the relevance of search results, summarizes their content, and presents them on a single screen. The aim of this study was to find an alternative data gathering method with a faster and simpler searching experience. This method was able to quickly find and gather data requested by a user through a search term. The data was then analyzed and presented to the user in summarized form, eliminating the need to visit the source of the content. A survey was performed by having a small target group of users answer a questionnaire. The results showed that the method was quick, results were often relevant, and the summaries reduced the need to visit the source page. While the method has potential for future development, it is hindered by ethical issues related to the use of web scrapers. Data collection Mobile devices Web scraping Summarization methods User-centered design Computer and Information Sciences
6	Automatizovaná extracia informácií z internetu / Automated web information extraction Smotrila, Tomáš January 2011 Web sites offer a huge amount of information, often on pages generated from data stored in databases. However, emphasis is placed on displaying the information rather than on its machine processing. Part of the thesis is the design and implementation of a prototype system that retrieves data from dynamically generated web pages using the programming by demonstration technique. Such a system allows the user to show the system, using the mouse, how to proceed when gathering information from a website. Based on such an example, the system derives a procedure for acquiring information from similar sites. The implemented system is able to collect user-relevant information from similar sites, for example in the form of a simple table suitable for further machine processing.
7	Interaktivní procházení webu a extrakce dat / Interactive web crawling and data extraction Fejfar, Petr January 2018 Title: Interactive crawling and data extraction Author: Bc. Petr Fejfar Author's e-mail address: pfejfar@gmail.com Department: Department of Distributed and Dependable Systems Supervisor: Mgr. Pavel Ježek, Ph.D., Department of Distributed and Dependable Systems Abstract: The subject of this thesis is Web crawling and data extraction from Rich Internet Applications (RIAs). The thesis starts with an analysis of modern Web pages along with the techniques used for crawling and data extraction. Based on this analysis, we designed a tool that crawls RIAs according to instructions defined by the user through a graphical interface. In contrast with other currently popular tools for RIAs, our solution is targeted at users with no programming experience, including business and analyst users. The designed solution is itself implemented as an RIA, using the WebDriver protocol to automate multiple browsers according to user-defined instructions. Our tool allows the user to inspect browser sessions by displaying the pages that are being crawled simultaneously. This feature enables the user to troubleshoot the crawlers. The outcome of this thesis is a fully designed and implemented tool enabling business users to extract data from RIAs. This opens new opportunities for this type of user to collect data from Web pages for use...
8	Skrapa försäljningssidor på nätet : Ett ramverk för webskrapningsrobotar / Scraping online sales pages: A framework for web scraping robots Karlsson, Emil, Edberg, Mikael January 2016 Today the internet offers a large number of sales websites where new ads arrive constantly. We see a need for a tool that monitors these websites around the clock to see how much is being sold and what is being sold. Creating a program that monitors websites is time-consuming, so we have created a framework that makes it easier to build web scrapers focused on list-based sales websites. There are several frameworks for web scraping, but very few focus solely on this type of website. Web scraping Web crawling Framework List-based sales websites Computer Sciences
9	Automated Discovery of Real-Time Network Camera Data from Heterogeneous Web Pages Ryan Merrill Dailey (8086355) 14 January 2021 Reduction in the cost of Network Cameras along with a rise in connectivity enables entities all around the world to deploy vast arrays of camera networks. Network cameras offer real-time visual data that can be used for studying traffic patterns, emergency response, security, and other applications. Although many sources of Network Camera data are available, collecting the data remains difficult due to variations in programming interfaces and website structures. Previous solutions rely on manually parsing the target website, taking many hours to complete. We create a general and automated solution for indexing Network Camera data spread across thousands of uniquely structured webpages. We analyze heterogeneous webpage structures and identify common characteristics among 73 sample Network Camera websites (each website has multiple web pages). These characteristics are then used to build an automated camera discovery module that crawls and indexes Network Camera data. Our system successfully extracts 57,364 Network Cameras from 237,257 unique web pages. Computer Engineering Pattern Recognition and Data Mining Web scraping Network Cameras Web Content Differentiation Automated Data Aggregation
10	Platforma pro sběr kryptoměnových adres / Platform for Cryptocurrency Address Collection Bambuch, Vladislav January 2020 The goal of this thesis is to create a platform for collecting and displaying metadata about cryptocurrency addresses from the public and dark web. To achieve this goal, I used web processing technologies written in PHP. The complications that accompany automated processing of web pages were solved with Apache Kafka and its ability to scale processes. Modularity of the platform was achieved through a microservices architecture and Docker containerization. The work enables a unique way to search for potential criminal activity that took place outside the blockchain, using a web application for managing the platform and searching the extracted data. The platform simplifies adding new, mutually independent modules, with Apache Kafka mediating communication between them. The result of this work can be used for the detection and prevention of cybercrime. Users of the system may be law enforcement authorities or other actors and users interested in the reputation and credibility of cryptocurrency addresses.
