31 |
System Agnostic GUI Testing : Analysis of Augmented Image Recognition Testing. Amundberg, Joel; Moberg, Martin. January 2021 (has links)
No description available.
|
32 |
Positioning and tracking using image recognition and triangulation. Boström, Viktor. January 2021 (has links)
Triangulation is used in a wide range of position-estimation applications. Usually it is done by measuring angles by hand to estimate positions in land surveying, navigation and astronomy. With the rise of image recognition arises the possibility to triangulate automatically. The aim of this thesis is to use the image recognition camera Pixy2 to triangulate a target in three dimensions. It builds on previous projects on the topic, extending the system to estimate positions over a larger space using more Pixy2s. The setup used five Pixy2s with pan-tilt kits and one Raspberry Pi 4 B. Some limitations of the hardware were discovered, limiting the extent of the space in which triangulation could be performed successfully. Furthermore, there were some issues with the image recognition algorithm in the environment where positioning was performed. The thesis was successful in that it managed to triangulate positions over a larger area than previous projects, and in all three dimensions. The system could also follow a target's trajectory, albeit with some gaps in the measurements.
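The angle-based triangulation the abstract describes can be sketched in its simplest two-dimensional form: two cameras at known positions each report a bearing to the target, and the target lies at the intersection of the two rays. The function below is an illustrative sketch of that geometry only, not the thesis's actual five-camera, three-dimensional implementation.

```python
import math

def triangulate_2d(p1, theta1, p2, theta2):
    """Intersect two bearing rays from camera positions p1 and p2.

    p1, p2: (x, y) camera positions; theta1, theta2: bearing angles
    in radians measured from the x-axis. Returns the (x, y) target
    estimate where the two rays cross.
    """
    # Each ray is p + t * (cos(theta), sin(theta)); solve for the crossing.
    d1 = (math.cos(theta1), math.sin(theta1))
    d2 = (math.cos(theta2), math.sin(theta2))
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-9:
        raise ValueError("rays are parallel; no unique intersection")
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# Two cameras 2 m apart, both sighting a target at (1, 1):
print(triangulate_2d((0, 0), math.atan2(1, 1), (2, 0), math.atan2(1, -1)))
# → approximately (1.0, 1.0)
```

Extending this to three dimensions adds a tilt angle per camera and a least-squares intersection of near-skew rays, which is where the hardware limits the abstract mentions come into play.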
|
33 |
A Comparative study of cancer detection models using deep learning. Omar Ali, Nasra. January 2020 (has links)
Leukemia is a form of cancer that can be fatal; rehabilitating and treating it requires a correct and early diagnosis. To reduce waiting times for test results, standard methods have been transformed into automated computer tools that analyze, diagnose, and predict symptoms. In this work, a comparative study was performed between two different leukemia detection methods: a genomic sequencing method, which is a binary classification model, and an image-processing method, which is a multi-class classification model. The methods had different input values; however, both used a convolutional neural network (CNN) as network architecture, and both split their datasets using 3-way cross-validation.
The evaluation methods for analyzing the results were learning curves, a confusion matrix, and a classification report. The results showed that the genomic sequencing method performed better, correctly predicting more values with a total accuracy of 98%, compared to the image-processing method's total accuracy of 81%. The difference in dataset sizes may be one cause of the algorithms' different test results.
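The confusion-matrix evaluation mentioned above reduces to a simple computation: overall accuracy is the diagonal of the matrix divided by its total. The sketch below is illustrative only; the 2x2 matrix is made up to match the 98% figure, not taken from the thesis's data.

```python
def accuracy_from_confusion(cm):
    """Overall accuracy from a confusion matrix.

    cm[i][j] holds the number of samples of true class i predicted
    as class j, so the diagonal entries are the correct predictions.
    """
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

# An illustrative binary confusion matrix with 98% accuracy:
cm = [[49, 1],
      [1, 49]]
print(accuracy_from_confusion(cm))  # → 0.98
```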
|
34 |
Application of Analogical Reasoning for Use in Visual Knowledge Extraction. Combs, Kara Lian. January 2021 (has links)
No description available.
|
35 |
Object Recognition in Satellite Images using improved Convolutional Recurrent Neural Network. Nattala, Tarun. January 2023 (has links)
Background: This research concerns detecting objects in images from satellites. The recognition of satellite images has become increasingly important due to the vast amount of data that can be obtained from satellites. This thesis aims to develop a method for the recognition of satellite images using machine learning techniques.
Objective: The main objective of this thesis is a unique approach to recognizing the data with a CRNN (Convolutional Recurrent Neural Network) algorithm for image recognition in satellite images using machine learning. The main task is classifying the images accurately, which is achieved by utilizing object classification algorithms. The CRNN architecture is chosen because it can effectively extract features from satellite images using convolutional blocks and leverage the memory of Long Short-Term Memory (LSTM) networks to connect the extracted features efficiently. The connected features improve the accuracy of the model significantly.
Method: The proposed method involves a literature review to survey current image recognition models, followed by experimentation: training a CRNN, a CNN and an RNN and comparing their performance using the metrics described in the thesis.
Results: The performance of the proposed method is evaluated using various metrics, including precision, recall, F1 score and inference speed, on a large dataset of labeled images. The results indicate that high accuracy is achieved in detecting and classifying objects in satellite images through this approach. Potential applications of the proposed method include environmental monitoring, urban planning, and disaster management.
Conclusion: Classification is performed on two satellite-image datasets, for ships and cars. The proposed architectures are CRNN, CNN, and RNN, and these three models are compared in order to find the best-performing algorithm. The results indicate that the CRNN has the best accuracy, precision, F1 score and inference speed, indicating strong performance by the CRNN. Keywords: Comparison of CRNN, CNN, and RNN; image recognition; machine learning; algorithms; You Only Look Once Version 3; satellite images; aerial images; deep learning
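The precision, recall and F1 metrics used to compare the three models are computed per class from true-positive, false-positive and false-negative counts. The sketch below shows the standard formulas; the counts in the example are made up for illustration.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall and F1 from raw prediction counts."""
    precision = tp / (tp + fp)   # of everything predicted positive, how much was right
    recall = tp / (tp + fn)      # of everything actually positive, how much was found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1

# Illustrative counts: 80 true positives, 20 false positives, 10 false negatives.
p, r, f = precision_recall_f1(80, 20, 10)
print(round(p, 3), round(r, 3), round(f, 3))  # → 0.8 0.889 0.842
```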
|
36 |
A Small Classification Experiment Between Dolls and Humans With CNN. Reinders, Ylva; Runnstrand, Josefin. January 2021 (has links)
This study describes a small experiment using CNN models to see how well they differentiate between dolls and humans. The experiment used two different CNN models: one built after a classic model and one more rudimentary model. The models were tested on how accurately they predicted the right answer. The experiment was a three-class problem with a set of different parameters to test what would make it harder for the system to classify the images correctly. The original images were digitally altered to test different conditions: the models were tested on a dataset of negatives of the original images, one set with higher contrast than the original, one set with different light conditions, one set with higher brightness, and three different levels of low resolution. The study concludes that brightness and lighting are the two most difficult conditions, and that the contours in the image are the most important factor for successful classification.
/ Bachelor's thesis in electrical engineering 2021, KTH, Stockholm
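The image manipulations tested above (negatives, brightness, contrast) are simple per-pixel operations on 8-bit grayscale values. The sketch below is an illustrative implementation on a tiny list-of-rows "image", not the study's actual preprocessing code.

```python
def negative(img):
    """Invert an 8-bit grayscale image given as a list of pixel rows."""
    return [[255 - p for p in row] for row in img]

def brightness(img, offset):
    """Shift every pixel by offset, clamped to the valid range [0, 255]."""
    return [[max(0, min(255, p + offset)) for p in row] for row in img]

def contrast(img, factor):
    """Scale each pixel's deviation from mid-grey (128) by factor, clamped."""
    return [[max(0, min(255, int(128 + factor * (p - 128)))) for p in row]
            for row in img]

img = [[0, 64], [192, 255]]
print(negative(img))        # → [[255, 191], [63, 0]]
print(brightness(img, 70))  # → [[70, 134], [255, 255]]
print(contrast(img, 2.0))   # → [[0, 0], [255, 255]]
```

Raising contrast pushes pixels toward the extremes while preserving edges, which is consistent with the study's finding that contours matter more than absolute brightness.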
|
37 |
VISUAL AND SEMANTIC KNOWLEDGE TRANSFER FOR NOVEL TASKS. Ye, Meng. January 2019 (has links)
Data is a critical component in a supervised machine learning system. Many successful applications of learning systems on various tasks are based on a large amount of labeled data. For example, deep convolutional neural networks have surpassed human performance on ImageNet classification, which consists of millions of labeled images. However, one challenge in conventional supervised learning systems is their generalization ability. Once a model is trained on a specific dataset, it can only perform the task on those seen classes and cannot be used for novel unseen classes. To make the model work on new classes, one has to collect and label new data and then re-train the model. However, collecting and labeling data is labor-intensive and costly; in some cases it is even impossible. Also, there is an enormous number of different tasks in the real world, and it is not practical to create a dataset for each of them. These problems raise the need for Transfer Learning, which aims to use data from the source domain to improve the performance of a model on the target domain, where the two domains have different data or different tasks. One specific case of transfer learning is Zero-Shot Learning (ZSL). It deals with the situation where the source domain and target domain have the same data distribution but do not share the same set of classes. For example, a model is given animal images of 'cat' and 'dog' for training and will be tested on classifying 'tiger' and 'wolf' images, which it has never seen. Unlike conventional supervised learning, Zero-Shot Learning does not require training data in the target domain to perform classification. This property gives ZSL the potential to be broadly applied in various applications where a system is expected to tackle unexpected situations.
In this dissertation, we develop algorithms that help a model effectively transfer visual and semantic knowledge learned from a source task to a target task. More specifically, first we develop a model that learns a uniform visual representation of semantic attributes, which helps alleviate the domain shift problem in Zero-Shot Learning. Second, we develop an ensemble network architecture with a progressive training scheme, which transfers source domain knowledge to the target domain in an end-to-end manner. Lastly, we move a step beyond ZSL and explore Label-less Classification, which transfers knowledge from pre-trained object detectors into scene classification tasks. Our label-less classification takes advantage of word embeddings trained from unorganized online text, thus eliminating the need for expert-defined semantic attributes for each class. Through comprehensive experiments, we show that the proposed methods effectively transfer visual and semantic knowledge between tasks and achieve state-of-the-art performance on standard datasets. / Computer and Information Science
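Attribute-based zero-shot classification of the kind discussed in this abstract can be reduced to matching an attribute vector predicted from the image against the known attribute vectors of unseen classes. The sketch below uses cosine similarity and made-up attribute vectors; it illustrates the general idea only, not the dissertation's actual models.

```python
def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def zero_shot_classify(predicted_attrs, class_attrs):
    """Pick the unseen class whose attribute vector is most similar
    to the attributes predicted from the image."""
    return max(class_attrs, key=lambda c: cosine(predicted_attrs, class_attrs[c]))

# Hypothetical attribute vectors, e.g. [striped, four-legged, pack-hunter]:
unseen = {"tiger": [1, 1, 0], "wolf": [0, 1, 1]}
print(zero_shot_classify([0.9, 1.0, 0.1], unseen))  # → tiger
```

Because the class description lives in attribute (or word-embedding) space rather than in labeled images, no training examples of 'tiger' or 'wolf' are needed at all, which is the property the abstract highlights.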
|
38 |
Optimering av operationer: Ai-implementering och bildigenkänning i små och medelstora företag / Optimizing Operations: AI Implementation and Image Recognition in SMEs. Hoang, David; Kochar, Bawan. January 2024 (has links)
The purpose of this thesis is to investigate factors that influence the implementation of AI, particularly image recognition, in SMEs. An inductive approach was used, with data collected through observations, documentation, meetings and physical artefacts. This provided context about the company's ambitions and goals as well as detailed, rich information about the processes it wants to streamline. The theoretical framework formed the basis of factors to consider when implementing AI, and these were consistently related to SMEs' limitations and opportunities. The same was done for the image recognition process, but linked more closely with the empirical data to ground and highlight theoretical assumptions in a realistic scenario. The results of this study contribute theoretically by developing a framework for understanding AI implementation in SMEs. In practice, the study provides guidelines for Jensen - Group and other SMEs that want to explore AI implementation with image recognition to achieve their business goals. The results suggest that although AI implementation in SMEs like Jensen - Group brings considerable benefits in terms of operational efficiency, cost savings and customer experience, it also poses challenges related to financial costs, competence gaps, and laws and regulations. Effective strategies and a thorough understanding of these aspects are critical to successful AI integration in SMEs.
|
39 |
Optical Three-Dimensional Image Matching Using Holographic Information. Kim, Taegeun. 04 September 2000 (has links)
We present a three-dimensional (3-D) optical image matching technique and location extraction techniques of matched 3-D objects for optical pattern recognition. We first describe the 3-D matching technique based on two-pupil optical heterodyne scanning. A hologram of the 3-D reference object is first created and then represented as one pupil function with the other pupil function being a delta function. The superposition of each beam modulated by the two pupils generates a scanning beam pattern. This beam pattern scans the 3-D target object to be recognized. The output of the scanning system gives out the 2-D correlation of the hologram of the reference object and that of the target object. When the 3-D image of the target object is matched with that of the reference object, the output of the system generates a strong correlation peak. This theory of 3-D holographic matching is analyzed in terms of two-pupil optical scanning. Computer simulation and optical experiment results are presented to reinforce the developed theory.
The second part of the research concerns extracting the location of a matched 3-D object. The proposed system performs a correlation of the hologram of a 3-D reference object with that of a 3-D target object, and hence 3-D matching is possible. However, the system does not directly give the depth location of matched 3-D target objects, because the correlation of holograms is a 2-D correlation and hence not 3-D shift invariant. We propose two methods to extract the location of matched 3-D objects directly from the correlation output of the system. One method uses an optical system that focuses the output correlation pattern along depth and reads the 3-D location from the focused position. However, this technique has a drawback in that only the location of 3-D targets that are farther away from the 3-D reference object can be extracted. Thus, in this research, we propose another method in which the location of a matched 3-D object can be extracted without the aforementioned drawback. This method applies the Wigner distribution to the power fringe-adjusted filtered correlation output to extract the 3-D location of a matched object. We analyze the proposed method and present computer simulation and optical experiment results. / Ph. D.
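The core operation, locating a matched object from a correlation peak, can be illustrated with a 1-D discrete cross-correlation: the lag at which the correlation peaks gives the shift between reference and target. This is a simplified digital stand-in for the optical 2-D correlation in the thesis, with made-up signals.

```python
def cross_correlate(a, b):
    """Full discrete cross-correlation of two 1-D signals.

    Entry k corresponds to lag (k - len(b) + 1); at each lag the
    overlapping samples of a and the shifted b are multiplied and summed.
    """
    n, m = len(a), len(b)
    out = []
    for lag in range(-(m - 1), n):
        s = sum(a[i] * b[i - lag] for i in range(max(0, lag), min(n, lag + m)))
        out.append(s)
    return out

def match_location(reference, target):
    """Return the offset in target where reference matches best (peak lag)."""
    corr = cross_correlate(target, reference)
    peak = max(range(len(corr)), key=corr.__getitem__)
    return peak - (len(reference) - 1)

# A [1, 2, 1] pattern hidden at offset 2 inside the target signal:
print(match_location([1, 2, 1], [0, 0, 1, 2, 1, 0]))  # → 2
```

The 2-D-only shift invariance discussed above is visible even here: correlation recovers lateral shift directly, while depth (the third axis) needs the extra focusing or Wigner-distribution step the thesis proposes.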
|
40 |
VTQuestAR: An Augmented Reality Mobile Software Application for Virginia Tech Campus Visitors. Yao, Zhennan. 07 January 2021 (has links)
The main campus of Virginia Polytechnic Institute and State University (Virginia Tech) has more than 120 buildings. The campus visitors face problems recognizing a building, finding a building, obtaining directions from one building to another, and getting information about a building. The exploratory development research described herein resulted in an iPhone / iPad software application (app) named VTQuestAR that provides assistance to the campus visitors by using the Augmented Reality (AR) technology. The Machine Learning (ML) technology is used to recognize a sample of 31 campus buildings in real-time. The VTQuestAR app enables the user to have a visual interactive experience with those 31 campus buildings by superimposing building information on top of the building picture shown through the camera. The app also enables the user to get directions from the current location or a building to another building displayed on a 2D map as well as an AR map. The user can perform complex searches on 122 campus buildings by building name, description, abbreviation, category, address, and year built. The app enables the user to take multimedia notes during a campus visit. Our exploratory development research illustrates the feasibility of using AR and ML in providing much more effective assistance to visitors of any organization. / Master of Science / The main campus of Virginia Polytechnic Institute and State University (Virginia Tech) has more than 120 buildings. The campus visitors face problems recognizing a building, finding a building, obtaining directions from one building to another, and getting information about a building. The exploratory development research described herein resulted in an iPhone / iPad software application named VTQuestAR that provides assistance to the campus visitors by using the Augmented Reality (AR) and Machine Learning (ML) technologies. 
Our research illustrates the feasibility of using AR and ML in providing much more effective assistance to visitors of any organization.
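Directions between buildings, as described above, start from distances between GPS coordinates; the haversine formula below gives the great-circle distance between two points. This is an illustrative sketch of that one step, and the coordinates in the example are arbitrary, not actual campus buildings.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude
    points, using a spherical-Earth approximation (radius 6371 km)."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# One degree of longitude at the equator is roughly 111 km:
print(round(haversine_m(0.0, 0.0, 0.0, 1.0)))
```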
|