91 |
OCR: A Statistical Model of Multi-engine OCR Systems
McDonald, Mercedes Terre 01 January 2004
This thesis is a benchmark performed on three commercial Optical Character Recognition (OCR) engines. The purpose of this benchmark is to characterize the performance of the OCR engines, with emphasis on the correlation of errors between the engines. The benchmarks are performed to evaluate the effect of a multi-OCR system that employs a voting scheme to increase overall recognition accuracy. This is desirable because OCR systems are still unable to recognize characters with 100% accuracy. The existing error rates of OCR engines pose a major problem for applications where a single error can affect significant outcomes, such as legal applications. The results obtained from this benchmark are the primary determining factor in the decision to implement a voting scheme. The experiment showed a very high accuracy rate for each of these commercial OCR engines. The average accuracy rate found for each engine was near 99.5%, based on a document of fewer than 6,000 words. While these error rates are very low, the goal in legal applications is 100% accuracy. Based on the work in this thesis, it has been determined that a simple voting scheme will help to improve the accuracy rate.
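As a rough illustration of why low error correlation matters for a voting scheme, the sketch below computes how often at least two of three engines would be wrong on the same token. It assumes that the engines err independently and that each has an error rate of about 0.5%, which is what the reported 99.5% accuracy implies. Both are simplifying assumptions, and the function name is illustrative rather than taken from the thesis.

```python
def two_of_three_wrong(p: float) -> float:
    """Probability that at least two of three engines misread the same token.

    Assumes (hypothetically) that the engines err independently, each with
    probability p. This is an upper bound on the failure rate of a majority
    vote, because two wrong engines only outvote the correct one when they
    agree on the same wrong answer.
    """
    exactly_two = 3 * p**2 * (1 - p)  # three ways to choose the one correct engine
    all_three = p**3
    return exactly_two + all_three


single_engine_error = 0.005  # about 99.5% accuracy per engine (assumed equal for all three)
combined_error = two_of_three_wrong(single_engine_error)
print(f"single engine: {single_engine_error:.4%}, at least two wrong: {combined_error:.4%}")
```

Under these assumptions the bound falls from 0.5% to roughly 0.0075%. Correlated errors would make the real gain smaller, which is why the thesis measures the correlation before committing to a voting scheme.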
|
92 |
Autonomous Repair Of Optical Character Recognition Data Through Simple Voting And Multi-dimensional Indexing Techniques
Sprague, Christopher 01 January 2005
The three major optical character recognition (OCR) engines in use today (ExperVision, ScanSoft OCR, and ABBYY OCR) are all capable of recognizing text with near-perfect accuracy. The remaining errors, however, have proven very difficult to identify within a single engine. Recent research has shown that the errors of the three engines have very little correlation, so the engines, when used in conjunction, may increase the accuracy of the final result. This document discusses the implementation and results of a simple voting system designed to test this hypothesis and show a statistical improvement in overall accuracy. Additional aspects of implementing an improved OCR scheme, such as aligning the output of multiple engines and recognizing application-specific solutions, are also addressed in this research. Although voting systems are currently in use by many major OCR engine developers, this research focuses on the addition of a collaborative system that can exploit the strengths of multiple engines while also addressing the immediate need for practical industry applications such as litigation and forms processing. Doculex™, a major developer and leader in the document imaging industry, provided the funding for this research.
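The two steps named in the abstract, output alignment and simple voting, can be sketched as follows. This is a minimal illustration and not the thesis's implementation: it assumes each engine's output is already split into word tokens, it uses Python's standard `difflib.SequenceMatcher` for alignment (a swapped-in technique), and it falls back to a designated reference engine when no two engines agree. The function names and sample tokens are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher
from typing import List, Optional


def align_to_reference(reference: List[str], other: List[str]) -> List[Optional[str]]:
    """For each reference position, return the token `other` has there, or None."""
    aligned: List[Optional[str]] = [None] * len(reference)
    matcher = SequenceMatcher(a=reference, b=other, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Use matching runs, and substitutions of equal length, position by position.
        # Insertions, deletions and uneven substitutions are left as None (no vote).
        if tag == "equal" or (tag == "replace" and i2 - i1 == j2 - j1):
            for offset in range(i2 - i1):
                aligned[i1 + offset] = other[j1 + offset]
    return aligned


def vote(reference: List[str], others: List[List[str]]) -> List[str]:
    """Majority vote per token; keep the reference token when nothing reaches two votes."""
    aligned = [align_to_reference(reference, tokens) for tokens in others]
    result = []
    for i, ref_token in enumerate(reference):
        candidates = [ref_token] + [a[i] for a in aligned if a[i] is not None]
        token, count = Counter(candidates).most_common(1)[0]
        result.append(token if count >= 2 else ref_token)
    return result


engine_a = ["the", "qu1ck", "brown", "fox"]   # hypothetical engine outputs
engine_b = ["the", "quick", "brown", "fox"]
engine_c = ["the", "quick", "brovvn", "fox"]
print(vote(engine_a, [engine_b, engine_c]))
```

Aligning everything to one reference engine keeps the sketch short, but it means tokens that the reference engine drops entirely can never be recovered. A production system would need a symmetric multi-way alignment.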
|
93 |
Learning from the Past: The Case of the Weimar Republic: A Proposal for Historical Analysis, Revision and Digitization
De Paduanis, Giulia January 2022
In a world in which current events increasingly evoke episodes from the past and former crises, understanding history becomes fundamental to building an informed solution strategy. Nevertheless, one should also take care to leave one’s contemporary judgment and knowledge in the present while dissecting the past for valuable insights. In this master’s thesis, I submit a research proposal to fellow students and the research community at large. It includes a case study that, firstly, analyses the Weimar Republic’s newspaper landscape and the lack of an extensive, centralized digitized archive of its widely decentralised press and, secondly, analyses the transformation of language over time in a newspaper sample from the Aachener Anzeiger. Through this sample and its analysis, I wish to highlight the importance of understanding the past, through a combination of distant and close reading techniques, so that future adversities can be resolved more easily. Interest in the history of Weimar Germany is steadily regaining momentum within and outside academia, as several contemporary events seem to parallel this short-lived first attempt at democracy, which emerged after the end of the German Empire. In this thesis, history will be analysed through the digital textual analysis of newspapers. The limitations of this approach will be illustrated and discussed, such as the challenges posed by decentralized archival material, the issues OCR encounters when digitizing the Fraktur typeface, and the resulting importance of digitizing such typefaces to avoid historical erasure. Furthermore, making such findings and research accessible to society at large is fundamental, as the political developments of our times affect everyone, whether they belong to academia or not. 
In the final chapter, new research pathways will be proposed and discussed, while also considering the case of contemporary history and politics and the essential aspects of digitization and social acceleration of life through technology.
|
94 |
Real-time Optical Character Recognition in steel bars using YOLOV5
Gattupalli, Monica January 2023
Background. Identifying the quality of products in the manufacturing industry is a challenging task. Manufacturers use needles to print unique numbers on the products to differentiate between good-quality and bad-quality products. However, identifying these needle-printed characters can be difficult. Hence, new technologies such as deep learning and optical character recognition (OCR) are used to identify these characters. Objective. The primary objective of this thesis is to identify the needle-printed characters on steel bars. This objective is divided into two sub-objectives. The first sub-objective is to identify the region of interest on the steel bars and extract it from the images. The second sub-objective is to identify the characters on the steel bars from the extracted images. The YOLOV5 and YOLOV5-obb object detection algorithms are used to achieve these objectives. Method. A literature review was first performed to select the algorithms; the dataset, which was provided by OVAKO, was then collected. The dataset included 1000 old images and 3000 new images of steel bars. To answer RQ2, existing OCR techniques were first used on the old images, with low accuracy. The YOLOV5 algorithm was therefore used on the old images to detect the region of interest. Different rotation techniques were applied to the cropped images (cropped after the bounding box was detected), but no promising result was observed, so YOLOV5 was used at the character level to identify the characters; the results were unsatisfactory. YOLOV5-obb was therefore used on the new images, which resulted in good accuracy. Results. Accuracy and mAP are used to assess the performance of the OCR techniques and the selected object detection algorithms. Existing OCR was also used in the extraction; however, it had an accuracy of 0%, which implies that it failed to identify the characters. 
With a mAP of 0.95, YOLOV5 is good at extracting cropped images but fails to identify the characters. When YOLOV5-obb is used to obtain the orientation, it achieves a mAP of 0.93. Due to time constraints, the last part of the thesis was not implemented. Conclusion. The present research employed the YOLOV5 and YOLOV5-obb object detection algorithms to identify needle-printed characters on steel bars. By first selecting the region of interest and then extracting images, the study objectives were met. Finally, character-level identification was performed on the old images using the YOLOV5 technique and on the new images using the YOLOV5-obb algorithm, with promising results.
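The mAP figures quoted above rest on intersection over union (IoU): a predicted box counts as a correct detection only when its IoU with a ground-truth box reaches a chosen threshold. The sketch below shows IoU for axis-aligned boxes only. The oriented boxes produced by YOLOV5-obb need a polygon intersection instead, and the threshold used in the thesis is not stated in the abstract, so the 0.5 below is an assumption (a commonly used value).

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


IOU_THRESHOLD = 0.5  # assumed; not taken from the thesis

predicted = (0.0, 0.0, 2.0, 2.0)     # hypothetical predicted region of interest
ground_truth = (1.0, 1.0, 3.0, 3.0)  # hypothetical annotated region
is_true_positive = iou(predicted, ground_truth) >= IOU_THRESHOLD
```

In this made-up example the boxes overlap in a 1 × 1 square out of a combined area of 7, so the detection would not count as a true positive at the assumed threshold.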
|
95 |
TikToks Påverkan: En studie om recensioners påverkan på TikTok-användare – Inom elektronisk Word-Of-Mouth / Impact of TikTok: A study on the influence of reviews on TikTok users – Within electronic Word-Of-Mouth
Maqedonci, Lorita, Voca, Vanesa January 2024
Title: Impact of TikTok: A study on the influence of reviews on TikTok users – Within electronic Word-Of-Mouth.
Level: Student thesis, final assignment for Bachelor's Degree in Business Administration
Authors: Lorita Maqedonci and Vanesa Voca
Supervisor: Martin Ahlenius
Date: January 2024
Aim: The purpose of this study is to investigate how reviews on TikTok influence users through electronic Word-Of-Mouth (eWOM).
Method: This research is based on a deductive approach, in which a survey served as the foundation for the empirical analysis and data collection. The statistical program "JASP" was used to analyze the data, including correlation analyses and descriptive statistics. The study's respondents are individuals who use the TikTok platform.
Result and conclusion: The study investigated how TikTok reviews influence purchase intentions and purchase decisions through electronic Word-Of-Mouth (eWOM). The results indicated that daily users have higher trust in reviews and are more likely to seek product recommendations on TikTok. Although there were associations between frequency of use, trust, and changes in brand opinions, the number of respondents was limited. Nevertheless, the study provides insights into how eWOM reviews on TikTok affect users' trust, attitudes, and purchasing behaviors.
Contribution of the thesis: The study's findings contribute insights into how TikTok reviews influence users' trust, attitudes, and purchasing behaviors. Other users, companies, and brands on the platform can learn what influences consumers and tailor their content accordingly.
Suggestions for future research: Future research should target specific and limited populations, such as young women aged 18–24, for a deeper understanding of eWOM effects. Semi-structured interviews can be used to explore nuanced aspects of respondents' thoughts and experiences. Additionally, research can focus on influencer marketing on TikTok and examine how different types of brand collaborations affect consumer trust and purchase intentions. An intriguing research question is how continuity and regularity in brand collaborations affect influencers' credibility and consumer trust.
Key words: eWOM, positive/negative WOM, e-commerce, OCR, IACM, TikTok, and social media.
|
96 |
Recognition of off-line printed Arabic text using Hidden Markov Models
Al-Muhtaseb, Husni A., Mahmoud, Sabri A., Qahwaji, Rami S.R. January 2008
This paper describes a technique for automatic recognition of off-line printed Arabic text using Hidden Markov Models. In this work different sizes of overlapping and non-overlapping hierarchical windows are used to generate 16 features from each vertical sliding strip. Eight different Arabic fonts were used for testing (viz. Arial, Tahoma, Akhbar, Thuluth, Naskh, Simplified Arabic, Andalus, and Traditional Arabic). It was experimentally proven that different fonts have their highest recognition rates at different numbers of states (5 or 7) and codebook sizes (128 or 256).
Arabic text is cursive, and each character may have up to four different shapes based on its location in a word. This research work considered each shape as a different class, resulting in a total of 126 classes (compared to 28 Arabic letters). The achieved average recognition rates were between 98.08% and 99.89% for the eight experimental fonts.
The main contributions of this work are the novel hierarchical sliding window technique using only 16 features for each sliding window, considering each shape of Arabic characters as a separate class, bypassing the need for segmenting Arabic text, and its applicability to other languages.
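To make the idea of hierarchical sliding-window features concrete, the sketch below turns each vertical strip of a binarized text-line image into 16 ink counts. The abstract does not give the paper's actual window layout, so the layout here is purely an assumption chosen to total 16: 8 non-overlapping cells, 4 pairs of cells, 2 halves, the whole strip, and one central window that overlaps both halves. The function names, the left-to-right scan order and the requirement that the line height be a multiple of 8 are likewise illustrative.

```python
from typing import List

Image = List[List[int]]  # rows of pixels, 1 = ink, 0 = background


def strip_features(image: Image, x0: int, width: int) -> List[int]:
    """Return 16 ink counts for the vertical strip covering columns [x0, x0 + width)."""
    height = len(image)
    if height == 0 or height % 8 != 0:
        raise ValueError("this sketch assumes a non-empty image whose height is a multiple of 8")
    cell_h = height // 8
    # Level 1: 8 non-overlapping cells stacked from top to bottom.
    cells = [
        sum(sum(row[x0:x0 + width]) for row in image[k * cell_h:(k + 1) * cell_h])
        for k in range(8)
    ]
    pairs = [cells[i] + cells[i + 1] for i in range(0, 8, 2)]  # level 2: 4 windows
    halves = [sum(cells[:4]), sum(cells[4:])]                  # level 3: 2 windows
    whole = [sum(cells)]                                       # level 4: 1 window
    centre = [sum(cells[2:6])]                                 # 1 window overlapping both halves
    return cells + pairs + halves + whole + centre


def sliding_features(image: Image, width: int = 1, step: int = 1) -> List[List[int]]:
    """Slide the strip across the line and collect one 16-value vector per position."""
    n_cols = len(image[0])
    return [strip_features(image, x, width) for x in range(0, n_cols - width + 1, step)]


line_image = [[1, 1, 1] for _ in range(8)]  # hypothetical 8 x 3 image, all ink
vectors = sliding_features(line_image)
```

Each vector would then be quantized against a codebook and passed to the HMM as one observation, so no character segmentation is needed before recognition.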
|
97 |
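Mobilt läsverktyg med OCR-teknologi och textmodifiering: Läsverktyg för personer med ADHD och dyslexi / Mobile reading tool with OCR technology and text modification: A reading tool for people with ADHD and dyslexia
Kask, Ella January 2024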
In Sweden, approximately one in four to one in five individuals experience reading difficulties for various reasons. Existing tools on the market are primarily adapted for digital text, but some applications allow users to photograph or upload images of text and have it read aloud via text-to-speech through OCR technology. Unfortunately, these functions are limited to text-to-speech only and are primarily used in educational settings. For individuals with ADHD or dyslexia, who may exhibit similar but not identical difficulties, these tools may be insufficient.
This work presents a prototype for a mobile application developed with a user-centered design to extend functionality beyond what is offered in today’s tools. The prototype enables the customization of the text’s visual presentation directly on mobile devices, which can improve the reading experience not only for individuals with specific reading difficulties but also for the general population. This approach is particularly advantageous for those with reading difficulties, as it allows them to adjust the appearance of the text to whatever best facilitates their reading.
The prototype uses OCR technology to extract text from images, allowing users to manipulate the text’s visual presentation to enhance readability. This includes settings such as character and line spacing, background color, and word highlighting. Research indicates that such adaptations can facilitate reading, especially for those experiencing reading difficulties, such as dyslexia and, not infrequently, ADHD, by providing access and a less strenuous visual presentation of text.
By combining technical solutions with social value, this project aims to reduce the gap in information accessibility and support individual independence and integration into society. The proposed mobile application represents a step forward in the development of more inclusive and adaptable reading tools, which are not confined to digital or educational environments but can be used in a wide range of everyday situations.
|
98 |
Extracting Textual Data from Historical Newspaper Scans and its Challenges for 'Guerilla-Projects'
Wehrheim, Lino, Liebl, Bernhard, Burghardt, Manuel 11 July 2024
In 2022, it is a commonplace that digital historical newspapers (DHN) have become increasingly available. Despite the undeniable progress in the supply of DHN and in the methods for rigorous quantitative analysis, however, working with DHN still poses various pitfalls, especially when scholars use data provided by third parties, such as libraries or commercial providers. Reporting from a current project, we want to share our experiences and communicate the various problems we faced while working with DHN. After a short project summary, we present the main problems that we faced in our project and that we think might also be relevant for other scholars, particularly those who work in small research groups. We arrange these problems according to an archetypal workflow, which is divided into the three steps of corpus acquisition, corpus evaluation, and corpus preparation. By raising some red flags, we want to call attention to what we think are common DHN-related problems, raise awareness of potential pitfalls, and thereby provide some guidelines for scholars who consider using DHN for their research.
|
99 |
Controle de cargas conteinerizadas utilizando elementos da cadeia logística segura e do programa brasileiro de Operador Econômico Autorizado (OEA). / Control of containerized cargo using elements of the secure supply chain and the Brazilian Authorized Economic Operator program (AEO)
Lima, Alexsandro Soares de 13 May 2015
This work proposes a process to assist in implementing Secure Supply Chain controls for the import and export of containerized cargo transported by road. It complies with current Brazilian legislation with regard to the Receita Federal do Brasil and the other consenting authorities. It also includes the new guidelines of the Brazilian Authorized Economic Operator Program, which began in the first fortnight of December 2014, as well as the main aspects of the SAFE Framework of the World Customs Organization (WCO) and of the U.S. Customs-Trade Partnership Against Terrorism (C-TPAT) program. The proposed process covers the instrumentation of the controls and their main data integration points, the stage at which most current economic operators find themselves. The proposal is justified by the complexity of supply chain processes and their importance to foreign trade and, therefore, to the country's economy, which require constant improvement in order to meet the growing competitiveness of markets and to control and manage the risks and uncertainties of globalization. The research methodology consisted of studies on the meaning of a secure supply chain, existing laws and standards, the main technologies used in Brazil and worldwide, and their system integration strategies, with a focus on some management projects that already exist in the country. The Port of Santos was taken as the main field of research. The work demonstrated the importance of three fundamental characteristics in a secure supply chain process: it must be instrumented, integrated and intelligent. On the basis of the proposed process, it will be possible to increase the degree of intelligence of a supply chain so as to manage and mitigate potential risks more rationally.
|