411 |
Determining the Effectiveness of Human Interaction in Human-in-the-Loop Systems by Using Mental States / Unknown Date
Self-adaptive software is developed to predict the stock market. Its Stock
Prediction Engine functions autonomously when its skill-set suffices to achieve its goal,
and it includes human-in-the-loop when it recognizes conditions benefiting from more
complex, expert human intervention. Key to the system is a module that decides on
human participation. It works by monitoring three mental states unobtrusively and in real
time with Electroencephalography (EEG). The mental states are drawn from the
Opportunity-Willingness-Capability (OWC) model. This research demonstrates that the
three mental states are predictive of whether the Human Computer Interaction System
functions better autonomously (human with low scores on opportunity, willingness,
and/or capability) or with the human-in-the-loop, with willingness carrying the
largest predictive power. This transdisciplinary software engineering research
exemplifies the next step of self-adaptive systems in which human and computer benefit from optimized autonomous and cooperative interactions, and in which neural inputs
allow for unobtrusive pre-interactions. / Includes bibliography. / Thesis (M.S.)--Florida Atlantic University, 2016. / FAU Electronic Theses and Dissertations Collection
|
412 |
Shape-based image retrieval in iconic image databases / January 1999
by Chan Yuk Ming. / Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. / Includes bibliographical references (leaves 117-124). / Abstract also in Chinese. / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Content-based Image Retrieval --- p.3 / Chapter 1.2 --- Designing a Shape-based Image Retrieval System --- p.4 / Chapter 1.3 --- Information on Trademark --- p.6 / Chapter 1.3.1 --- What is a Trademark? --- p.6 / Chapter 1.3.2 --- Search for Conflicting Trademarks --- p.7 / Chapter 1.3.3 --- Research Scope --- p.8 / Chapter 1.4 --- Information on Chinese Cursive Script Character --- p.9 / Chapter 1.5 --- Problem Definition --- p.9 / Chapter 1.6 --- Contributions --- p.11 / Chapter 1.7 --- Thesis Organization --- p.13 / Chapter 2 --- Literature Review --- p.14 / Chapter 2.1 --- Trademark Retrieval using QBIC Technology --- p.14 / Chapter 2.2 --- STAR --- p.16 / Chapter 2.3 --- ARTISAN --- p.17 / Chapter 2.4 --- Trademark Retrieval using a Visually Salient Feature --- p.18 / Chapter 2.5 --- Trademark Recognition using Closed Contours --- p.19 / Chapter 2.6 --- Trademark Retrieval using a Two Stage Hierarchy --- p.19 / Chapter 2.7 --- Logo Matching using Negative Shape Features --- p.21 / Chapter 2.8 --- Chapter Summary --- p.22 / Chapter 3 --- Background on Shape Representation and Matching --- p.24 / Chapter 3.1 --- Simple Geometric Features --- p.25 / Chapter 3.1.1 --- Circularity --- p.25 / Chapter 3.1.2 --- Rectangularity --- p.26 / Chapter 3.1.3 --- Hole Area Ratio --- p.27 / Chapter 3.1.4 --- Horizontal Gap Ratio --- p.27 / Chapter 3.1.5 --- Vertical Gap Ratio --- p.28 / Chapter 3.1.6 --- Central Moments --- p.28 / Chapter 3.1.7 --- Major Axis Orientation --- p.29 / Chapter 3.1.8 --- Eccentricity --- p.30 / Chapter 3.2 --- Fourier Descriptors --- p.30 / Chapter 3.3 --- Chain Codes --- p.31 / Chapter 3.4 --- Seven Invariant Moments --- p.33 / Chapter 3.5 --- Zernike Moments --- p.35 / Chapter 3.6 --- Edge Direction Histogram --- p.36 / Chapter 3.7 
--- Curvature Scale Space Representation --- p.37 / Chapter 3.8 --- Chapter Summary --- p.39 / Chapter 4 --- Genetic Algorithm for Weight Assignment --- p.42 / Chapter 4.1 --- Genetic Algorithm (GA) --- p.42 / Chapter 4.1.1 --- Basic Idea --- p.43 / Chapter 4.1.2 --- Genetic Operators --- p.44 / Chapter 4.2 --- Why GA? --- p.45 / Chapter 4.3 --- Weight Assignment Problem --- p.46 / Chapter 4.3.1 --- Integration of Image Attributes --- p.46 / Chapter 4.4 --- Proposed Solution --- p.47 / Chapter 4.4.1 --- Formalization --- p.47 / Chapter 4.4.2 --- Proposed Genetic Algorithm --- p.43 / Chapter 4.5 --- Chapter Summary --- p.49 / Chapter 5 --- Shape-based Trademark Image Retrieval System --- p.50 / Chapter 5.1 --- Problems on Existing Methods --- p.50 / Chapter 5.1.1 --- Edge Direction Histogram --- p.51 / Chapter 5.1.2 --- Boundary Based Techniques --- p.52 / Chapter 5.2 --- Proposed Solution --- p.53 / Chapter 5.2.1 --- Image Preprocessing --- p.53 / Chapter 5.2.2 --- Automatic Feature Extraction --- p.54 / Chapter 5.2.3 --- Approximated Boundary --- p.55 / Chapter 5.2.4 --- Integration of Shape Features and Query Processing --- p.58 / Chapter 5.3 --- Experimental Results --- p.58 / Chapter 5.3.1 --- Experiment 1: Weight Assignment using Genetic Algorithm --- p.59 / Chapter 5.3.2 --- Experiment 2: Speed on Feature Extraction and Retrieval --- p.62 / Chapter 5.3.3 --- Experiment 3: Evaluation by Precision --- p.63 / Chapter 5.3.4 --- Experiment 4: Evaluation by Recall for Deformed Images --- p.64 / Chapter 5.3.5 --- Experiment 5: Evaluation by Recall for Hand Drawn Query Trademarks --- p.66 / Chapter 5.3.6 --- "Experiment 6: Evaluation by Recall for Rotated, Scaled and Mirrored Images" --- p.66 / Chapter 5.3.7 --- Experiment 7: Comparison of Different Integration Methods --- p.68 / Chapter 5.4 --- Chapter Summary --- p.71 / Chapter 6 --- Shape-based Chinese Cursive Script Character Image Retrieval System --- p.72 / Chapter 6.1 --- Comparison to Trademark Retrieval 
Problem --- p.79 / Chapter 6.1.1 --- Feature Selection --- p.73 / Chapter 6.1.2 --- Speed of System --- p.73 / Chapter 6.1.3 --- Variation of Style --- p.73 / Chapter 6.2 --- Target of the Research --- p.74 / Chapter 6.3 --- Proposed Solution --- p.75 / Chapter 6.3.1 --- Image Preprocessing --- p.75 / Chapter 6.3.2 --- Automatic Feature Extraction --- p.76 / Chapter 6.3.3 --- Thinned Image and Linearly Normalized Image --- p.76 / Chapter 6.3.4 --- Edge Directions --- p.77 / Chapter 6.3.5 --- Integration of Shape Features --- p.78 / Chapter 6.4 --- Experimental Results --- p.79 / Chapter 6.4.1 --- Experiment 8: Weight Assignment using Genetic Algorithm --- p.79 / Chapter 6.4.2 --- Experiment 9: Speed on Feature Extraction and Retrieval --- p.81 / Chapter 6.4.3 --- Experiment 10: Evaluation by Recall for Deformed Images --- p.82 / Chapter 6.4.4 --- Experiment 11: Evaluation by Recall for Rotated and Scaled Images --- p.83 / Chapter 6.4.5 --- Experiment 12: Comparison of Different Integration Methods --- p.85 / Chapter 6.5 --- Chapter Summary --- p.87 / Chapter 7 --- Conclusion --- p.88 / Chapter 7.1 --- Summary --- p.88 / Chapter 7.2 --- Future Research --- p.89 / Chapter 7.2.1 --- Limitations --- p.89 / Chapter 7.2.2 --- Future Directions --- p.90 / Chapter A --- A Representative Subset of Trademark Images --- p.91 / Chapter B --- A Representative Subset of Cursive Script Character Images --- p.93 / Chapter C --- Shape Feature Extraction Toolbox for Matlab V5.3 --- p.95 / Chapter C.1 --- central_moment --- p.95 / Chapter C.2 --- centroid --- p.96 / Chapter C.3 --- cir --- p.96 / Chapter C.4 --- css --- p.97 / Chapter C.5 --- css_match --- p.100 / Chapter C.6 --- ecc --- p.102 / Chapter C.7 --- edge_directions --- p.102 / Chapter C.8 --- fourier_d --- p.105 / Chapter C.9 --- gen_shape --- p.106 / Chapter C.10 --- hu7 --- p.108 / Chapter C.11 --- isclockwise --- p.109 / Chapter C.12 --- moment --- p.110 / Chapter C.13 --- normalized_moment --- p.111 / Chapter C.14 --- orientation --- p.111 / Chapter C.15 --- resample_pts --- p.112 / Chapter C.16 --- rectangularity --- p.113 / Chapter C.17 --- trace_points --- p.114 / Chapter C.18 --- warp_conv --- p.115 / Bibliography --- p.117
|
413 |
Calculating degenerate structures via convex optimization with applications in computer vision and pattern recognition / CUHK electronic theses & dissertations collection / January 2012
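In many computer vision and pattern recognition problems, the captured image and video data are high-dimensional. Computing directly on such data often runs into difficulties of computational feasibility and stability. Real-world data, however, are usually generated by a small number of physical factors and therefore have an intrinsically degenerate structure. For example, they can be described by models such as a subspace, a union of subspaces, a manifold, or a stratified manifold. Computing and exploiting these intrinsic degenerate structures not only deepens understanding of the problem but also helps solve difficulties in practical applications. / With the development of convex optimization theory and applications in recent years, some NP-hard problems, such as low-rank matrix computation and sparse representation, now have near-perfect and efficient solution methods. This thesis studies how to apply these techniques to compute degenerate structures in high-dimensional data, focusing on two structures, subspaces and unions of subspaces, and on their significance in real applications. These applications include face image alignment, background subtraction, and automatic plant identification. / In face image alignment, images of the same face under different illumination should lie in a low-dimensional subspace once they are aligned pixel by pixel. Based on this assumption, we propose a new alignment method that jointly aligns multiple images of an unknown face taken under different illumination, expressions, and poses, so that the pixels of each face image match a pre-trained generic face model. The basic idea is to pursue an affine subspace that is low-dimensional and lies near the generic face subspace. Whereas conventional appearance-model-based alignment methods (for example, the active appearance model) depend on an accurate appearance model, the proposed method needs only a generic face model to jointly align multiple images of the unknown face well, even when that face differs greatly from the samples used to train the model. Experimental results show that in some cases the alignment accuracy approaches the ideal case, namely the accuracy that conventional methods can reach when a model of the target face is known in advance. / In a wide range of computer vision and pattern recognition problems, the captured images and videos often live in high-dimensional observation spaces. Computing on them directly may suffer from computational infeasibility and numerical instability. On the other hand, real-world data are often generated by a limited number of physical causes, and thus embed degenerate structures by nature. For instance, they can be modeled by a low-dimensional subspace, a union of subspaces, a manifold or even a manifold stratification. Discovering and harnessing such intrinsic structures not only brings semantic insight into the problems at hand, but also provides critical information to overcome challenges encountered in practice. / Recent years have witnessed great development in both the theory and application of convex optimization. Efficient and elegant solutions have been found for NP-hard problems such as low-rank matrix recovery and sparse representation. In this thesis, we study the problem of discovering degenerate structures of high-dimensional inputs using these techniques. 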
In particular, we focus on low-dimensional subspaces and their unions, and address their application in overcoming the challenges encountered in three practical scenarios: face image alignment, background subtraction and automatic plant identification. / In facial image alignment, we propose a method that jointly brings multiple images of an unseen face into alignment with a pre-trained generic appearance model despite different poses, expressions and illumination conditions of the face in the images. The idea is to pursue an intrinsic affine subspace of the target face that is low-dimensional and at the same time lies close to the generic subspace. Compared with conventional appearance-based methods that rely on accurate appearance models, ours works well with only a generic one and performs much better on unseen faces even if they differ significantly from those used to train the generic model. The result is approximately as good as that in an idealized case where a specific model for the target face is provided. / For background subtraction, we propose a background model that captures the changes caused by the background switching among a few configurations, such as traffic light states. The background is modeled as a union of low-dimensional subspaces, each characterizing one configuration of the background, and the proposed algorithm automatically switches among them and identifies violating elements as foreground pixels. Moreover, we propose a robust learning approach that can work with foreground-present training samples at the background modeling stage: it builds a correct background model with outlying foreground pixels automatically pruned out. This is practically important when foreground-free training samples are difficult to obtain in scenarios such as traffic monitoring. / For automatic plant identification, we propose a novel and practical method that recognizes plants based on leaf shapes extracted from photographs. 
Unlike existing studies, which mostly focus on simple leaves, the proposed method is designed to recognize both simple and compound leaves. The key is that, instead of either measuring geometric features or matching shape features as in conventional methods, we describe leaves by counting the occurrences of certain shape patterns on them. The patterns are learned in such a way that they form a degenerate polytope (a special union of affine subspaces) in the feature space, and can simulate, to some extent, the "keys" used by botanists: each pattern reflects a common feature of several different species, and all the patterns together can form a discriminative rule for recognition. Experiments conducted on a variety of datasets show that our algorithm significantly outperforms the state-of-the-art methods in terms of recognition accuracy, efficiency and storage, and thus shows good promise for practical use. / In conclusion, our studies show that: 1) visual data with semantic meaning are often not random; although they can be high-dimensional, they typically embed degenerate structures in the observation space. 2) With appropriate assumptions and well-designed computational tools, these structures can be efficiently and stably calculated. 3) Employing these intrinsic structures helps overcome practical challenges and is critical for computer vision and pattern recognition algorithms to achieve good performance. / Detailed summary in vernacular field only. 
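/ In background subtraction, the background of a static scene under different illumination can be described as a linear subspace. In practice, however, local and sudden changes in the background can violate this assumption, especially when the background switches among several states, for example when traffic lights switch among different combinations of states. To address this, the thesis proposes a new background model that describes the background as a union of subspaces, each corresponding to one background state. We recast background subtraction as a sparse approximation problem, so the algorithm switches among the states automatically and successfully detects foreground objects. The thesis also proposes a robust dictionary learning method. While training the background model, it can handle images that contain foreground objects and removes the foreground parts automatically during training. This is a clear advantage in applications where foreground-free training samples are hard to collect, such as traffic monitoring. / In automatic plant species identification, the thesis proposes a new and effective method that recognizes and classifies plants by extracting and comparing leaf contours. Unlike conventional methods based on measuring geometric features or matching shape features, we represent a leaf by the counts of certain shape patterns found on it. These patterns form a degenerate polytope (a special union of affine subspaces) in the feature space and can, to some extent, mimic the identification keys used in botany: each pattern reflects a trait shared by several different plants, such as a type of margin, a type of shape, or a leaflet arrangement, while all the patterns together form a highly discriminative classification rule. Tests on four databases show that the proposed method improves significantly on current mainstream methods in recognition accuracy, efficiency, and storage, and is therefore well suited to practical use. / In summary, our series of studies shows that (1) meaningful visual data are usually intrinsically correlated: although their dimension may be high, they usually have some degenerate structure; (2) with reasonable assumptions and suitable computational tools, these structures can be found efficiently and stably; (3) exploiting these structures helps solve difficulties in practical applications and enables computer vision and pattern recognition algorithms to perform well. / Zhao, Cong. / Thesis (Ph.D.)--Chinese University of Hong Kong, 2012. / Includes bibliographical references (leaves 107-121). / Electronic reproduction. Hong Kong : Chinese University of Hong Kong, [2012] System requirements: Adobe Acrobat Reader. Available via World Wide Web. / Abstract also in Chinese. 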
/ Dedication --- p.i / Acknowledgements --- p.ii / Abstract --- p.v / Abstract (in Chinese) --- p.viii / Publication List --- p.xi / Nomenclature --- p.xii / Contents --- p.xiv / List of Figures --- p.xviii / Chapter 1 --- Introduction --- p.1 / Chapter 1.1 --- Motivation --- p.1 / Chapter 1.2 --- Background --- p.2 / Chapter 1.2.1 --- Subspaces --- p.3 / Chapter 1.2.2 --- Unions of Subspaces --- p.6 / Chapter 1.2.3 --- Manifolds and Stratifications --- p.8 / Chapter 1.3 --- Thesis Outline --- p.10 / Chapter 2 --- Joint Face Image Alignment --- p.13 / Chapter 2.1 --- Introduction --- p.14 / Chapter 2.2 --- Related Works --- p.16 / Chapter 2.3 --- Background --- p.18 / Chapter 2.3.1 --- Active Appearance Model --- p.18 / Chapter 2.3.2 --- Multi-Image Alignment using AAM --- p.20 / Chapter 2.3.3 --- Limitations in Practice --- p.21 / Chapter 2.4 --- The Proposed Method --- p.23 / Chapter 2.4.1 --- Two Important Assumptions --- p.23 / Chapter 2.4.2 --- The Subspace Pursuit Problem --- p.27 / Chapter 2.4.3 --- Reformulation --- p.27 / Chapter 2.4.4 --- Efficient Solution --- p.30 / Chapter 2.4.5 --- Discussions --- p.32 / Chapter 2.5 --- Experiments --- p.34 / Chapter 2.5.1 --- Settings --- p.34 / Chapter 2.5.2 --- Results and Discussions --- p.36 / Chapter 2.6 --- Summary --- p.38 / Chapter 3 --- Background Subtraction --- p.40 / Chapter 3.1 --- Introduction --- p.41 / Chapter 3.2 --- Related Works --- p.43 / Chapter 3.3 --- The Proposed Method --- p.48 / Chapter 3.3.1 --- Background Modeling --- p.48 / Chapter 3.3.2 --- Background Subtraction --- p.49 / Chapter 3.3.3 --- Foreground Object Detection --- p.52 / Chapter 3.3.4 --- Background Modeling by Dictionary Learning --- p.53 / Chapter 3.4 --- Robust Dictionary Learning --- p.54 / Chapter 3.4.1 --- Robust Sparse Coding --- p.56 / Chapter 3.4.2 --- Robust Dictionary Update --- p.57 / Chapter 3.5 --- Experimentation --- p.59 / Chapter 3.5.1 --- Local and Sudden Changes --- p.59 / Chapter 3.5.2 --- Non-structured High-frequency Changes --- p.62 / Chapter 3.5.3 --- Discussions --- p.65 / Chapter 3.6 --- Summary --- p.66 / Chapter 4 --- Plant Identification using Leaves --- p.67 / Chapter 4.1 --- Introduction --- p.68 / Chapter 4.2 --- Related Works --- p.70 / Chapter 4.3 --- Review of IDSC Feature --- p.71 / Chapter 4.4 --- The Proposed Method --- p.73 / Chapter 4.4.1 --- Independent-IDSC Feature --- p.75 / Chapter 4.4.2 --- Common Shape Patterns --- p.77 / Chapter 4.4.3 --- Leaf Representation by Counts --- p.80 / Chapter 4.4.4 --- Leaf Recognition by NN Classifier --- p.82 / Chapter 4.5 --- Experiments --- p.82 / Chapter 4.5.1 --- Settings --- p.82 / Chapter 4.5.2 --- Performance --- p.83 / Chapter 4.5.3 --- Shared Dictionaries vs. Shared Features --- p.88 / Chapter 4.5.4 --- Pooling --- p.89 / Chapter 4.6 --- Discussions --- p.90 / Chapter 4.6.1 --- Time Complexity --- p.90 / Chapter 4.6.2 --- Space Complexity --- p.91 / Chapter 4.6.3 --- System Description --- p.92 / Chapter 4.7 --- Summary --- p.92 / Chapter 4.8 --- Acknowledgement --- p.94 / Chapter 5 --- Conclusion and Future Work --- p.95 / Chapter 5.1 --- Thesis Contributions --- p.95 / Chapter 5.2 --- Future Work --- p.97 / Chapter 5.2.1 --- Theory Side --- p.98 / Chapter 5.2.2 --- Practice Side --- p.98 / Chapter Appendix-I --- Joint Face Alignment Results --- p.100 / Bibliography --- p.107
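The union-of-subspaces background model described in the abstract above can be illustrated with a minimal sketch. It makes strong simplifying assumptions that are not from the thesis: each background configuration is a one-dimensional subspace (a single basis image under varying brightness), the fit is plain least squares rather than the sparse approximation the thesis uses, and the names `project`, `subtract_background` and the threshold `tau` are hypothetical.

```python
def project(frame, basis):
    """Least-squares projection of a frame onto the 1-D subspace spanned by basis."""
    scale = sum(f * b for f, b in zip(frame, basis)) / sum(b * b for b in basis)
    return [scale * b for b in basis]


def subtract_background(frame, bases, tau=0.5):
    """Pick the configuration that best explains the frame, then flag outliers.

    bases: one basis image per background configuration (hypothetical layout).
    Returns (index of chosen configuration, per-pixel foreground mask).
    """
    best = None
    for index, basis in enumerate(bases):
        reconstruction = project(frame, basis)
        residual = [f - r for f, r in zip(frame, reconstruction)]
        cost = sum(abs(r) for r in residual)  # total absolute residual
        if best is None or cost < best[0]:
            best = (cost, index, residual)
    _, index, residual = best
    # Pixels the chosen subspace cannot explain are treated as foreground.
    mask = [abs(r) > tau for r in residual]
    return index, mask


# Two toy configurations of a 6-pixel scene: "red light on" and "green light on".
bases = [
    [1.0, 0.0, 0.0, 0.5, 0.5, 0.5],
    [0.0, 0.0, 1.0, 0.5, 0.5, 0.5],
]
# Green configuration with a bright foreground object on pixel 4.
frame = [0.0, 0.0, 1.0, 0.5, 1.5, 0.5]
config, mask = subtract_background(frame, bases)
```

In this toy frame the green configuration is selected and only pixel 4 is flagged. The least-squares fit is pulled slightly toward the foreground pixel, which is why the threshold here is set generously.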
|
414 |
Identifying Patterns in Behavioral Public Health Data Using Mixture Modeling with an Informative Number of Repeated Measures / Yu, Gary / January 2014
Finite mixture modeling is a useful statistical technique for clustering individuals based on patterns of responses. The fundamental idea of the mixture modeling approach is to assume there are latent clusters of individuals in the population which each generate their own distinct distribution of observations (multivariate or univariate) which are then mixed up together in the full population. Hence, the name mixture comes from the fact that what we observe is a mixture of distributions. The goal of this model-based clustering technique is to identify what the mixture of distributions is so that, given a particular response pattern, individuals can be clustered accordingly. Commonly, finite mixture models, as well as the special case of latent class analysis, are used on data that inherently involve repeated measures. The purpose of this dissertation is to extend the finite mixture model to allow for the number of repeated measures to be incorporated and contribute to the clustering of individuals rather than measures. The dimension of the repeated measures or simply the count of responses is assumed to follow a truncated Poisson distribution and this information can be incorporated into what we call a dimension informative finite mixture model (DIMM).
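The idea of letting the number of repeated measures inform cluster membership can be sketched as follows. This is an illustration under stated assumptions, not the dissertation's implementation: measures are taken to be independent univariate normals within a class, the count is taken to be zero-truncated Poisson (the abstract says only "truncated"), and all parameter values and function names are hypothetical.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (y - mu) ** 2 / (2 * sigma ** 2)


def zero_truncated_poisson_logpmf(n, lam):
    """log P(N = n | N >= 1) for a Poisson(lam) count (assumed truncation at zero)."""
    if n < 1:
        return float("-inf")
    return n * math.log(lam) - lam - math.lgamma(n + 1) - math.log1p(-math.exp(-lam))


def class_posteriors(measures, classes):
    """Posterior class probabilities given the measures and their count."""
    n = len(measures)
    log_terms = []
    for c in classes:
        log_term = math.log(c["weight"])
        log_term += zero_truncated_poisson_logpmf(n, c["lam"])  # count contribution
        log_term += sum(normal_logpdf(y, c["mu"], c["sigma"]) for y in measures)
        log_terms.append(log_term)
    top = max(log_terms)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(t - top) for t in log_terms]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Two classes with identical means and variances but different expected counts.
classes = [
    {"weight": 0.5, "mu": 0.0, "sigma": 1.0, "lam": 2.0},
    {"weight": 0.5, "mu": 0.0, "sigma": 1.0, "lam": 8.0},
]
few = class_posteriors([0.1, -0.2], classes)
many = class_posteriors([0.1, -0.2, 0.0, 0.3, -0.1, 0.2, 0.1, -0.3], classes)
```

Because the two classes share the same mean and variance, the measured values alone cannot separate them. Only the count term moves the posterior: toward the first class for two measures and toward the second for eight.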
The outline of this dissertation is as follows. Paper 1 is entitled, "Dimension Informative Mixture Modeling (DIMM) for questionnaire data with an informative number of repeated measures." This paper describes the type of data structures considered and introduces the dimension informative mixture model (DIMM). A simulation study is performed to examine how well the DIMM recovers the known, specified truth. In the first scenario, we specify a mixture of three univariate normal distributions with different means and similar variances with different and similar counts of repeated measurements. We found that the DIMM predicts the true underlying class membership better than the traditional finite mixture model using a predicted value metric score. In the second scenario, we specify a mixture of two univariate normal distributions with the same means and variances with different and similar counts of repeated measurements. We found that the count-informative finite mixture model predicts the truth much better than the non-informative finite mixture model.
Paper 2 is entitled, "Patterns of Physical Activity in the Northern Manhattan Study (NOMAS) Using Multivariate Finite Mixture Modeling (MFMM)." This is a study that applies a multivariate finite mixture modeling approach to examining and elucidating underlying latent clusters of different physical activity profiles based on four dimensions: total frequency of activities, average duration per activity, total energy expenditure and the total count of the number of different activities conducted. We found a five cluster solution to describe the complex patterns of physical activity levels, as measured by fifteen different physical activity items, among a US based elderly cohort. Adding in a class of individuals who were not doing any physical activity, the labels of these six clusters are: no exercise, very inactive, somewhat inactive, slightly under guidelines, meet guidelines and above guidelines. This methodology improves upon previous work which utilized only the total metabolic equivalent (a proxy of energy expenditure) to classify individuals into inactive, active and highly active.
Paper 3 is entitled, "Complex Drug Use Patterns and Associated HIV Transmission Risk Behaviors in an Internet Sample of US Men Who Have Sex With Men." This is a study that applies the count-informative information into a latent class analysis on nineteen binary drug items of drugs consumed within the past year before a sexual encounter. In addition to the individual drugs used, the mixture model incorporated a count of the total number of drugs used. We found a six class solution: low drug use, some recreational drug use, nitrite inhalants (poppers) with prescription erectile dysfunction (ED) drug use, poppers with prescription/non-prescription ED drug use and high polydrug use. Compared to participants in the low drug use class, participants in the highest drug use class were 5.5 times more likely to report unprotected anal intercourse (UAI) in their last sexual encounter and approximately 4 times more likely to report a new sexually transmitted infection (STI) in the past year. Younger men were also less likely to report UAI than older men but more likely to report an STI.
|
415 |
An integrated fuzzy rule-based image segmentation framework / Karmakar, Gour Chandra, 1970- / January 2002
Abstract not available
|
416 |
From multitarget tracking to event recognition in videos / Brendel, William / 12 May 2011
This dissertation addresses two fundamental problems in computer vision—namely,
multitarget tracking and event recognition in videos. These problems are challenging
because uncertainty may arise from a host of sources, including motion blur,
occlusions, and dynamic cluttered backgrounds. We show that these challenges can be
successfully addressed by using a multiscale, volumetric video representation, and
taking into account various constraints between events offered by domain knowledge.
The dissertation presents our two alternative approaches to multitarget tracking. The
first approach seeks to transitively link object detections across consecutive video
frames by finding the maximum independent set of a graph of all object detections.
Two maximum-independent-set algorithms are specified, and their convergence
properties theoretically analyzed. The second approach hierarchically partitions the
space-time volume of a video into tracks of objects, producing a segmentation graph of
that video. The resulting tracks encode rich contextual cues between salient video parts
in space and time, and thus facilitate event recognition, and segmentation in space and
time.
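The first tracking approach relies on independent sets in a graph. As a rough illustration only, the sketch below uses a simple greedy heuristic for a weighted independent set, not either of the two algorithms specified in the dissertation. It also assumes, hypothetically, that graph nodes are candidate links between detections in consecutive frames and that two links conflict when they share a detection.

```python
from itertools import combinations


def greedy_independent_set(weights, conflicts):
    """Greedy heuristic: repeatedly take the heaviest node not yet blocked.

    weights: dict mapping node -> score; conflicts: iterable of (u, v) edges.
    Returns an independent set (no two chosen nodes share an edge).
    """
    neighbors = {node: set() for node in weights}
    for u, v in conflicts:
        neighbors[u].add(v)
        neighbors[v].add(u)
    chosen, blocked = [], set()
    for node in sorted(weights, key=lambda n: (-weights[n], n)):
        if node not in blocked:
            chosen.append(node)
            blocked.add(node)
            blocked |= neighbors[node]  # neighbors can no longer be selected
    return chosen


# Candidate links (detection in frame t, detection in frame t+1) with match scores.
links = {
    ("a1", "b1"): 0.9,
    ("a1", "b2"): 0.4,
    ("a2", "b2"): 0.8,
    ("a2", "b1"): 0.3,
}
# Two links conflict if they reuse the same detection in either frame.
conflicts = [
    (p, q) for p, q in combinations(links, 2) if p[0] == q[0] or p[1] == q[1]
]
selected = greedy_independent_set(links, conflicts)
```

Here the heuristic keeps the links a1 to b1 and a2 to b2, so each detection is used at most once. A greedy pass gives no optimality guarantee in general.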
We also describe our two alternative approaches to event recognition. The first
approach seeks to learn a structural probabilistic model of an event class from training
videos represented by hierarchical segmentation graphs. The graph model is then used
for inference of event occurrences in new videos. Learning and inference algorithms
are formulated within the same framework, and their convergence rates theoretically
analyzed. The second approach to event recognition uses probabilistic first-order logic
for reasoning over continuous time intervals. We specify the syntax, learning, and
inference algorithms of this probabilistic event logic.
Qualitative and quantitative results on benchmark video datasets are also presented.
The results demonstrate that our approaches provide consistent video interpretation
with respect to acquired domain knowledge. We outperform most of the state-of-the-art
approaches on benchmark datasets. We also present our new basketball dataset that
complements existing benchmarks with new challenges. / Graduation date: 2011 / Access restricted to the OSU Community at author's request from May 12, 2011 - May 12, 2012
|
417 |
Audio-video based handwritten mathematical content recognition / Vemulapalli, Smita / 12 November 2012
Recognizing handwritten mathematical content is a challenging problem, and more so when such content appears in classroom videos. However, because the handwritten text and the accompanying audio in such videos refer to the same content, a combination of video- and audio-based recognizers has the potential to significantly improve content recognition accuracy. This dissertation, using a combination of video- and audio-based recognizers, focuses on improving the recognition accuracy associated with handwritten mathematical content in such videos.
Our approach makes use of a video recognizer as the primary recognizer and a multi-stage assembly, developed as part of this research, is used to facilitate effective combination with an audio recognizer. Specifically, we address the following challenges related to audio-video based handwritten mathematical content recognition: (1) Video Preprocessing - generates a timestamped sequence of segmented characters from the classroom video in the face of occlusions and shadows caused by the instructor, (2) Ambiguity Detection - determines the subset of input characters that may have been incorrectly recognized by the video based recognizer and forwards this subset for disambiguation, (3) A/V Synchronization - establishes correspondence between the handwritten character and the spoken content, (4) A/V Combination - combines the synchronized outputs from the video and audio based recognizers and generates the final recognized character, and (5) Grammar Assisted A/V Based Mathematical Content Recognition - utilizes a base mathematical speech grammar for both character and structure disambiguation. Experiments conducted using videos recorded in a classroom-like environment demonstrate the significant improvements in recognition accuracy that can be achieved using our techniques.
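Stages (2) and (4) above, ambiguity detection and A/V combination, can be sketched in a few lines. Everything specific here is an assumption made for illustration: the top-two margin test, the weighted log-score fusion, the parameter values, and the function names are hypothetical and are not taken from the dissertation.

```python
import math


def is_ambiguous(video_scores, margin=0.2):
    """Flag a character when the video recognizer's top two scores are close."""
    ranked = sorted(video_scores.values(), reverse=True)
    return len(ranked) > 1 and ranked[0] - ranked[1] < margin


def combine(video_scores, audio_scores, video_weight=0.6, floor=1e-6):
    """Fuse both recognizers with a weighted sum of log scores."""
    combined = {}
    for char, video_score in video_scores.items():
        audio_score = audio_scores.get(char, floor)  # unseen in audio: tiny score
        combined[char] = (
            video_weight * math.log(max(video_score, floor))
            + (1.0 - video_weight) * math.log(max(audio_score, floor))
        )
    return max(combined, key=combined.get)


def recognize(video_scores, audio_scores):
    """Video is primary; audio is consulted only for ambiguous characters."""
    if not is_ambiguous(video_scores):
        return max(video_scores, key=video_scores.get)
    return combine(video_scores, audio_scores)


# The video recognizer cannot separate "2" from "z"; the audio favors "z".
video = {"2": 0.48, "z": 0.45, "7": 0.07}
audio = {"2": 0.10, "z": 0.85, "7": 0.05}
result = recognize(video, audio)

# A confident video result is returned without consulting the audio.
confident = recognize({"x": 0.9, "+": 0.1}, {"x": 0.1, "+": 0.9})
```

Keeping the video recognizer primary means the audio stream can only change characters that were flagged as ambiguous, which limits the damage from a poorly synchronized or noisy audio channel.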
|
418 |
Representing and Recognizing Temporal Sequences / Shi, Yifan / 15 August 2006
Activity recognition falls in the general area of pattern recognition, but it resides mainly in the temporal domain, which gives it distinctive characteristics. We provide an extensive survey of existing tools, including FSM, HMM, BNT, DBN, SCFG, and the Symbolic Network Approach (PNF-network). These tools do not efficiently meet many of the requirements of activity recognition, which leads this work to develop a new graphical model: the Propagation Net (P-Net).
Many activities can be represented by a partially ordered set of temporal intervals, each of which corresponds to a primitive motion. Each interval has both temporal and logical constraints that control the duration of the interval and its relationship with other intervals. P-Net takes advantage of these fundamental constraints to provide a graphical conceptual model for describing human knowledge and an efficient computational model to facilitate recognition and learning.
P-Nets define an exponentially large joint distribution that standard Bayesian inference cannot handle. We devise two approximation algorithms to interpret a multi-dimensional observation sequence of evidence as a multi-stream propagation process through a P-Net. First, the Local Maximal Search Algorithm (LMSA) is constructed with polynomial complexity. Second, we introduce a particle-filter-based framework, the Discrete Condensation (D-Condensation) algorithm, which samples the discrete state space more efficiently than the original Condensation algorithm.
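For readers unfamiliar with particle filtering over a discrete state space, here is a minimal bootstrap particle filter on a toy three-state chain. It is a generic textbook filter, not D-Condensation and not a P-Net; the transition and emission tables, the particle count, and the function name are all assumptions chosen for the example.

```python
import random


def particle_filter(observations, transition, emission, n_particles=2000, seed=0):
    """Bootstrap particle filter over discrete states 0..K-1.

    transition[s] lists the probabilities of moving from state s to each state;
    emission[s][o] is the probability of observing symbol o in state s.
    Returns the estimated belief over states after the last observation.
    """
    rng = random.Random(seed)
    n_states = len(transition)
    states = list(range(n_states))
    particles = [rng.randrange(n_states) for _ in range(n_particles)]
    for obs in observations:
        # Predict: move every particle according to the transition model.
        particles = [rng.choices(states, weights=transition[s])[0] for s in particles]
        # Weight: score each particle by how well it explains the observation.
        weights = [emission[s][obs] for s in particles]
        # Resample: draw a new population in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    counts = [0] * n_states
    for s in particles:
        counts[s] += 1
    return [c / n_particles for c in counts]


# A left-to-right chain of three primitive motions with noisy detectors.
transition = [
    [0.7, 0.3, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.0, 1.0],
]
emission = [
    [0.90, 0.05, 0.05],
    [0.05, 0.90, 0.05],
    [0.05, 0.05, 0.90],
]
belief = particle_filter([0, 0, 1, 1, 2, 2], transition, emission)
```

After the observation sequence walks through all three symbols, nearly all particles sit in the final state.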
To construct a P-Net based system, we need two parts: the P-Net and the corresponding detector set. Given topology information and a detector library, P-Net parameters can be extracted easily from a relatively small number of positive examples. To avoid the tedious process of manually constructing the detector library, we introduce a semi-supervised learning framework to build the P-Net and the corresponding detectors together. Furthermore, we introduce the Contrast Boosting algorithm, which forces the detectors to be as different as possible but not necessarily non-overlapping.
The classification and learning abilities of P-Nets are verified on three data sets: 1) a vision-tracked indoor activity data set; 2) a vision-tracked glucose monitor calibration data set; 3) a sensor data set on a simple weight-lifting exercise. Comparisons with standard SCFG and HMM show that a P-Net based system is easier to construct and has a superior ability to classify complex human activity and detect anomalies.
|
419 |
Physiologically Motivated Methods For Audio Pattern Classification / Ravindran, Sourabh / 20 November 2006
Human-like performance by machines in tasks of speech and audio processing has remained an elusive goal. In an attempt to bridge the gap in performance between humans and machines, there has been an increased effort to study and model physiological processes. However, the widespread use of biologically inspired features proposed in the past has been hampered mainly by either a lack of robustness across a range of signal-to-noise ratios or formidable computational costs. In physiological systems, sensor processing occurs in several stages. It is likely that signal features and biological processing techniques evolved together and are complementary or well matched. It is precisely for this reason that modeling the feature extraction processes should go hand in hand with modeling the processes that use these features. This research presents a front-end feature extraction method for audio signals inspired by the human peripheral auditory system. New developments in the field of machine learning are leveraged to build classifiers that maximize the performance gains afforded by these features. The structure of the classification system is similar to what might be expected in physiological processing. Further, the feature extraction and classification algorithms can be efficiently implemented using a low-power cooperative analog-digital signal processing platform. The usefulness of the features is demonstrated for tasks of audio classification, speech versus non-speech discrimination, and speech recognition. The low-power nature of the classification system makes it ideal for use in applications such as hearing aids, hand-held devices, and surveillance through acoustic scene monitoring.
|
420 |
Statistical methods for feature extraction in shape analysis and bioinformatics / Le Faucheur, Xavier Jean Maurice / 05 April 2010
The presented research explores two different problems of statistical data analysis.
In the first part of this thesis, a method for 3D shape representation, compression and smoothing is presented. First, a technique for encoding non-spherical surfaces using second generation wavelet decomposition is described. Second, a novel model is proposed for wavelet-based surface enhancement. This part of the work aims to develop an efficient algorithm for removing irrelevant and noise-like variations from 3D shapes. Surfaces are encoded using second generation wavelets, and the proposed methodology consists of separating noise-like wavelet coefficients from those contributing to the relevant part of the signal. The empirical-based Bayesian models developed in this thesis threshold wavelet coefficients in an adaptive and robust manner. Once thresholding is performed, irrelevant coefficients are removed and the inverse wavelet transform is applied to the clean set of wavelet coefficients. Experimental results show the efficiency of the proposed technique for surface smoothing and compression.
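The transform, threshold, and inverse-transform pipeline described above can be illustrated on a one-dimensional signal. This sketch deliberately substitutes simpler pieces: an orthonormal Haar transform stands in for second generation wavelets on surfaces, and a fixed hard threshold stands in for the adaptive empirical Bayesian thresholding developed in the thesis. Function names and the threshold value are hypothetical.

```python
import math


def haar_forward(signal):
    """Full orthonormal Haar decomposition; length must be a power of two."""
    coeffs = list(signal)
    n = len(coeffs)
    while n > 1:
        half = n // 2
        approx = [(coeffs[2 * i] + coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        detail = [(coeffs[2 * i] - coeffs[2 * i + 1]) / math.sqrt(2) for i in range(half)]
        coeffs[:n] = approx + detail  # approximations first, details after
        n = half
    return coeffs


def haar_inverse(coeffs):
    """Invert haar_forward, rebuilding the signal from coarse to fine."""
    out = list(coeffs)
    n = 1
    while n < len(out):
        merged = []
        for a, d in zip(out[:n], out[n:2 * n]):
            merged.append((a + d) / math.sqrt(2))
            merged.append((a - d) / math.sqrt(2))
        out[:2 * n] = merged
        n *= 2
    return out


def denoise(signal, threshold):
    """Zero out small detail coefficients, keep the rest, and reconstruct."""
    coeffs = haar_forward(signal)
    kept = [coeffs[0]] + [c if abs(c) > threshold else 0.0 for c in coeffs[1:]]
    return haar_inverse(kept)


# A step signal with small noise-like wiggles on each plateau.
noisy = [4.0, 4.1, 3.9, 4.0, 8.0, 8.1, 7.9, 8.0]
smooth = denoise(noisy, threshold=0.5)
```

The large coefficient that encodes the step survives the threshold while the small noise-like details are removed, so each plateau is flattened to its mean (4.0 and 8.0).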
The second part of this thesis proposes using a non-parametric clustering method for studying RNA (RiboNucleic Acid) conformations. The local conformation of RNA molecules is an important factor in determining their catalytic and binding properties. RNA conformations can be characterized by a finite set of parameters that define the local arrangement of the molecule in space. Their analysis is particularly difficult due to the large number of degrees of freedom, such as torsion angles and inter-atomic distances among interacting residues. In order to understand and analyze the structural variability of RNA molecules, this work proposes a methodology for detecting repetitive conformational sub-structures along RNA strands. Clusters of similar structures in the conformational space are obtained using a nearest-neighbor search method based on the statistical mechanical Potts model. The proposed technique is a mostly automatic clustering algorithm and, in contrast to many other clustering techniques, may be applied to problems where there is no prior knowledge of the structure of the data space. First, results are reported for both single-residue conformations, where the parameter set of the data space includes four to seven torsional angles, and base-pair geometries. For both types of data sets, a very good match is observed between the results of the proposed clustering method and other known classifications, with only a few exceptions. Second, new results are reported for base-stacking geometries. In this case, the proposed classification is validated with respect to specific geometrical constraints, while the content and geometry of the new clusters are fully analyzed.
|