Text in an image usually carries useful information related to the scene, such as locations, names, directions, and warnings, so extracting text regions from an image and interpreting their meaning has become a prominent topic in computer vision. However, most existing scene text detection methods target Latin-based languages; research on Chinese text is scarce, and the reported detection rates lag well behind those for English.
In this thesis, we propose a scene text detection method for both Chinese and English. The method comprises four stages: (1) preprocessing, in which a bilateral filter makes text regions more stable; (2) candidate text extraction, in which the Canny edge detector and Maximally Stable Extremal Regions (MSER) extract edge and region features, respectively, and the two are combined for more robust results; (3) character linking, in which candidate characters are clustered into text-line candidates using geometric constraints that account for the structure of Chinese characters and for both horizontal and vertical writing directions; (4) candidate classification, in which a support vector machine (SVM) trained on text features extracted from the image separates text strings from non-text strings. Experimental results show that the proposed method detects both Chinese and English text and achieves satisfactory performance compared with approaches designed only for English.
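The four stages map naturally onto standard OpenCV and scikit-learn building blocks. The sketch below is a minimal illustration of such a pipeline, not the implementation reported in the thesis: the bilateral-filter settings, the 2% edge-density test used to fuse the Canny and MSER outputs, the size and gap rules for horizontal and vertical linking, the three chain features, and the input file `scene.jpg` are all illustrative assumptions.

```python
# Minimal sketch of the four-stage pipeline using OpenCV and scikit-learn.
# All thresholds, the edge-density test, the grouping rules, and the chain
# features are illustrative assumptions, not the parameters from the thesis.
import cv2
from sklearn.svm import SVC


def candidate_boxes(image_bgr):
    """Stages 1-2: bilateral smoothing, then MSER regions kept only when
    they overlap enough Canny edge pixels."""
    smoothed = cv2.bilateralFilter(image_bgr, 9, 75, 75)
    gray = cv2.cvtColor(smoothed, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)
    mser = cv2.MSER_create()
    _, bboxes = mser.detectRegions(gray)
    kept = []
    for (x, y, w, h) in bboxes:
        # Edge density inside the box (assumed threshold of 2%).
        density = float(edges[y:y + h, x:x + w].mean()) / 255.0
        if density > 0.02:
            kept.append((x, y, w, h))
    return kept


def link_candidates(boxes, gap_ratio=1.5, size_tol=0.5):
    """Stage 3: greedily chain boxes of similar size that are adjacent
    along either the horizontal or the vertical writing direction."""
    chains = []
    for box in sorted(boxes):
        x, y, w, h = box
        for chain in chains:
            px, py, pw, ph = chain[-1]
            similar = abs(h - ph) < size_tol * max(h, ph)
            horiz = abs(y - py) < ph and 0 <= x - (px + pw) < gap_ratio * pw
            vert = abs(x - px) < pw and 0 <= y - (py + ph) < gap_ratio * ph
            if similar and (horiz or vert):
                chain.append(box)
                break
        else:
            chains.append([box])
    return [c for c in chains if len(c) >= 2]


def chain_features(gray, chain):
    """Stage 4 input: simple per-chain features (intensity spread of the
    enclosing patch, aspect ratio, and number of linked boxes)."""
    x0 = min(x for x, _, _, _ in chain)
    y0 = min(y for _, y, _, _ in chain)
    x1 = max(x + w for x, _, w, _ in chain)
    y1 = max(y + h for _, y, _, h in chain)
    patch = gray[y0:y1, x0:x1]
    return [float(patch.std()), (x1 - x0) / max(1, y1 - y0), len(chain)]


if __name__ == "__main__":
    image = cv2.imread("scene.jpg")                      # hypothetical input
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    chains = link_candidates(candidate_boxes(image))
    feats = [chain_features(gray, c) for c in chains]
    clf = SVC(kernel="rbf", gamma="scale")               # stage 4 classifier
    # clf.fit(train_feats, train_labels)  # needs labelled text/non-text chains
    # text_chains = [c for c, p in zip(chains, clf.predict(feats)) if p == 1]
```

Keeping only MSER regions with sufficient edge density is one common way to realise the edge/region fusion the abstract describes; the thesis may combine the two cues differently.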
Identifier | oai:union.ndltd.org:CHENGCHI/G0101753021
Creators | 梁苡萱, Liang, Yi Hsuan |
Publisher | National Chengchi University (國立政治大學)
Source Sets | National Chengchi University Libraries |
Language | Chinese
Detected Language | English |
Type | text |
Rights | Copyright © NCCU Library on behalf of the copyright holders