  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Specialization of an Existing Image Recognition Service Using a Neural Network

Ersson, Sara, Dahl, Oskar January 2018
To help combat the environmental impact caused by humans, this project investigates one way to simplify the waste management process. The idea is to use image recognition to identify what material a recyclable object is made of. A large data set of labeled images of trash, called Trashnet, was analyzed using Google Cloud Vision. Since this API is built for general image recognition rather than material detection specifically, a feed-forward neural network was created using Tensorflow and trained on the output from Google Cloud Vision. The network thus learned how different word combinations from Google Cloud Vision implied one of five materials: glass, plastic, paper, metal, or combustible waste. Its input layer checks for the 518 unique words that Google Cloud Vision returned in total when analyzing the Trashnet data set and passes them through two hidden layers of 1000 nodes each, followed by a one-hot output layer. This neural network achieved an accuracy of around 60%, which beat Google Cloud Vision's meager accuracy of around 30%. An application in which the user takes a picture of the object he or she would like to recycle could be developed for educational purposes, telling the user what material the waste is made of so that it can be thrown in the right bin.
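The architecture described in the abstract (a 518-word input, two hidden layers of 1000 nodes, and a five-way output) can be sketched as a forward pass in plain Python. This is only an illustration, not the authors' Tensorflow code: the ReLU and softmax activations, the random untrained weights, and all function names are assumptions that the abstract does not specify.

```python
import math
import random

# The five target materials named in the abstract.
MATERIALS = ["glass", "plastic", "paper", "metal", "combustible"]

def encode_labels(labels, vocabulary):
    """Multi-hot vector: 1.0 where a vocabulary word occurs among the labels."""
    present = {label.lower() for label in labels}
    return [1.0 if word in present else 0.0 for word in vocabulary]

def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one dense layer (untrained)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def dense(x, layer, relu=True):
    """Apply one dense layer; ReLU is an assumed activation."""
    weights, biases = layer
    out = [sum(w * v for w, v in zip(row, x)) + b
           for row, b in zip(weights, biases)]
    return [max(0.0, o) for o in out] if relu else out

def softmax(z):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def build_network(n_words=518, hidden=1000, n_classes=5, seed=0):
    """Layer sizes default to those in the abstract: 518 -> 1000 -> 1000 -> 5."""
    rng = random.Random(seed)
    sizes = [n_words, hidden, hidden, n_classes]
    return [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]

def predict(network, x):
    """Return the most probable material and the full probability vector."""
    for layer in network[:-1]:
        x = dense(x, layer)
    probs = softmax(dense(x, network[-1], relu=False))
    return MATERIALS[probs.index(max(probs))], probs
```

With untrained weights the prediction is arbitrary; the sketch only shows how a set of Google Cloud Vision label words would flow through a network of this shape to a single material class.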
2

Separation and Extraction of Valuable Information From Digital Receipts Using Google Cloud Vision OCR.

Johansson, Elias January 2019
Automation is a desirable feature in many business areas. Manually extracting information from a physical object such as a receipt is a task that can be automated to save resources for a company or a private person. This paper describes the process of combining an existing OCR engine with a purpose-built Python script to extract valuable information from a digital image of a receipt. Values such as VAT, VAT%, date, and total, gross, and net cost are considered valuable information. This feature has already been implemented in existing applications; however, the company for which I did this project is interested in creating its own version. The project is an experiment to see whether such an application, a program that can extract the information mentioned above, can be implemented with restricted resources. The paper guides the reader through the development of the program, as well as the mindset, the findings, and the steps taken to overcome the problems encountered along the way. The program achieved a success rate of 86.6% in extracting the most valuable information (total cost, VAT%, and date) from a set of 53 receipts originating from 34 separate establishments.
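As a rough illustration of the post-OCR step, the following sketch pulls the three fields the abstract calls most valuable (total cost, VAT%, and date) out of plain OCR text with regular expressions. The patterns, the keyword choices, the ISO date format, and the function names are all assumptions made for this example; the thesis abstract does not describe the actual script.

```python
import re

# Assumed patterns: ISO-style dates, a "VAT" keyword followed by a percentage,
# and a "total" keyword followed by an amount with two decimals.
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
VAT_RE = re.compile(r"\bvat\D{0,10}?(\d{1,2}(?:[.,]\d+)?)\s*%", re.IGNORECASE)
TOTAL_RE = re.compile(r"\btotal\D{0,15}?(\d+[.,]\d{2})", re.IGNORECASE)

def parse_amount(text):
    """Convert '39,90' or '39.90' to a float."""
    return float(text.replace(",", "."))

def extract_fields(ocr_text):
    """Return date, VAT percentage and total; None for anything not found."""
    date = DATE_RE.search(ocr_text)
    vat = VAT_RE.search(ocr_text)
    total = TOTAL_RE.search(ocr_text)
    return {
        "date": date.group(1) if date else None,
        "vat_percent": parse_amount(vat.group(1)) if vat else None,
        "total": parse_amount(total.group(1)) if total else None,
    }

# Made-up receipt text standing in for the OCR engine's output.
sample = "Example Store\n2019-03-14 12:01\nMilk 12.50\nBread 27.40\nVAT 12% 4.28\nTOTAL 39.90\n"
fields = extract_fields(sample)
```

Real receipts vary widely in layout and wording, so a fixed set of patterns like this would need per-language keywords and fallbacks to approach the reported success rate.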
3

Analogue meters in a digital world : Minimizing data size when offloading OCR processes

Davidsson, Robin, Sjölander, Fredrik January 2022
Introduction: Instead of replacing existing analogue water meters with Internet of Things (IoT) connected substitutes, an alternative would be to attach an IoT-connected module to the analogue water meter that reads the meter value optically using Optical Character Recognition (OCR). Such a module would need to be battery-powered, given that access to the electrical grid is typically limited near water meters. Research has shown that offloading the OCR process can reduce the power drawn from the battery, and that this can be reduced even further by reducing the amount of data that is transmitted.
Purpose: To minimise energy consumption in the proposed solution, the study aims to find out to what extent an input image's file size can be reduced by means of resolution, colour depth, and compression before the Google Cloud Vision OCR engine no longer returns feasible results.
Method and implementation: 250 images of analogue water meter values were processed by the Google Cloud Vision OCR engine through 38 000 different combinations of resolution, colour depth, and upscaling.
Results: The highest rate of successful OCR readings at a minimal file size was found among images with resolutions from 133 × 22 to 163 × 27 pixels and colour depths of 1 to 2 bits per pixel.
Conclusion: The study shows that there is potential for minimising data sizes, and thereby energy consumption, when offloading the OCR process by transmitting images of minimal file size.
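To give a sense of scale for the reported range, the uncompressed pixel payload of an image is simply width × height × bits per pixel. The short sketch below computes that figure for the two ends of the range. It ignores file headers and compression, so it is a rough bound on the pixel data rather than the file sizes measured in the study; the function name is an assumption.

```python
def raw_image_bytes(width, height, bits_per_pixel):
    """Uncompressed pixel payload in bytes, rounded up to a whole byte."""
    total_bits = width * height * bits_per_pixel
    return (total_bits + 7) // 8

# The two ends of the resolution and colour-depth range reported in the abstract.
smallest = raw_image_bytes(133, 22, 1)   # 366 bytes
largest = raw_image_bytes(163, 27, 2)    # 1101 bytes

# For comparison: the smallest resolution stored as 24-bit colour.
full_colour = raw_image_bytes(133, 22, 24)  # 8778 bytes
```

Even before compression, the pixel data in this range is on the order of a few hundred bytes to about one kilobyte per reading.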
4

REGTEST - an Automatic & Adaptive GUI Regression Testing Tool.

Forsgren, Robert, Petersson Vasquez, Erik January 2018
Software testing is very common and is done to increase the quality of, and confidence in, software. This report proposes a tool for GUI regression testing that uses image recognition to perform the steps of test cases. The problem with such a solution is that if changes have been made to a GUI, many test cases might break. For this reason REGTEST was created: a GUI regression testing tool that can handle one type of change made to a GUI component, such as a change in color, shape, location, or text. This type of solution is interesting because setting up tests with such a tool can be very fast and easy, but a major drawback of using image recognition for GUI testing has been that it does not handle changes well. It can be compared to tools that use IDs to perform a test, where the actual appearance of a GUI component does not matter; it only matters that the ID stays the same. Such tools, however, require either underlying knowledge of the GUI component naming conventions or the use of tools that automatically construct XPath queries for the components. To verify that REGTEST can work as well as existing tools, a comparison was made against two professional tools, Ranorex and Kantu. In those tests, REGTEST proved very successful and performed close to, or better than, the other software.
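The tolerance rule described above (a component is still accepted when only one kind of attribute has changed) can be illustrated with a small comparison over component descriptors. This is a hypothetical sketch: the `Component` fields mirror the four change types named in the abstract, but the abstract does not say how REGTEST represents or detects components, so every name below is an assumption.

```python
from dataclasses import dataclass, fields

@dataclass(frozen=True)
class Component:
    """Hypothetical descriptor of a GUI component as seen on screen."""
    color: str
    shape: str
    location: tuple
    text: str

def changed_attributes(expected, observed):
    """Names of the attributes that differ between two descriptors."""
    return [f.name for f in fields(Component)
            if getattr(expected, f.name) != getattr(observed, f.name)]

def still_matches(expected, observed, max_changes=1):
    """Accept the observed component if at most `max_changes` attributes changed."""
    return len(changed_attributes(expected, observed)) <= max_changes

baseline = Component("blue", "rectangle", (40, 120), "Save")
recolored = Component("green", "rectangle", (40, 120), "Save")
recolored_and_moved = Component("green", "rectangle", (300, 120), "Save")
```

Here `still_matches(baseline, recolored)` accepts the button after a color change alone, while `still_matches(baseline, recolored_and_moved)` rejects it because two attributes changed at once.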
5

Mobilní systém pro rozpoznání textu na iOS / Mobile System for Text Recognition on iOS

Bobák, Petr January 2017
This thesis describes the development of a modern client-server application for text recognition on the iOS platform. The reader is introduced to the common principles of the client-server model, including its known architectural styles and the distribution of logical layers between the two sides of the model. The thesis then describes current trends and examples of suitable technologies for creating the application programming interface of a web server. Possible approaches to text recognition on the server side are discussed as well. On the client side, the thesis provides insight into the iOS platform and a few important concepts in iOS application development. The subsequent server-side implementation is designed to be as reusable as possible across different kinds of use cases. The thesis also provides a simple iOS framework for direct communication with the recognition server. Finally, an application that evaluates food ingredients from packaging is implemented as an example of usage.
