Global ETD Search

1	Automatic detection of shot boundaries in digital video Yusoff, Yusseri January 2002 This thesis describes the implementation of automatic shot boundary detection algorithms for the detection of cuts and gradual transitions in digital video sequences. The objective was to develop a fully automatic video segmentation system as a pre-processing step for video database retrieval management systems, as well as for other applications that have large video sequences as part of their systems. For the detection of cuts, we begin by looking at a set of baseline algorithms that measure specific features of video images and calculate the dissimilarity of the measures between frames in the video sequence. We then propose two different approaches and compare them against the set of baseline algorithms. These approaches are themselves built upon the base set of algorithms. Observing that the baseline algorithms initially use hard thresholds to determine shot boundaries, we build Receiver Operating Characteristic (ROC) curves to plot the characteristics of the algorithms when varying the thresholds. In the first approach, we look into combining the multiple algorithms in such a way that, as a collective, the detection of cuts is improved. The results of the fusion are then compared against the baseline algorithms on the ROC curve. For the second approach, we look into having adaptive thresholds for the baseline algorithms. A selection of adaptive thresholding methods was applied to the data set and compared with the baseline algorithms that use hard thresholds. In the case of gradual transition detection, a filtering technique used to detect ramp edges in images is adapted for use in video sequences. The approach starts from the observation that shot boundaries represent edges in time, with cuts being sharp edges and gradual transitions closely approximating ramp edges.
The methods that we propose reflect our concentration on producing a reliable and efficient shot boundary detection mechanism. In each instance, be it for cuts or gradual transitions, we tested our algorithms on a comprehensive set of video sequences containing a variety of content, and obtained highly competitive results. 621 Video database indexing and retrieval
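As a rough illustration of the cut-detection idea in the abstract above (a per-frame feature, a dissimilarity between consecutive frames, then a hard or adaptive threshold), the following is a minimal sketch and not the thesis's actual algorithms. The intensity-histogram feature, the sliding-window mean-plus-k-standard-deviations rule, and every name and default value (`bins`, `threshold`, `window`, `k`, `floor`) are assumptions chosen for the example.

```python
from statistics import mean, stdev


def histogram(frame, bins=8, max_value=256):
    """Normalised intensity histogram of a frame given as a flat list of pixel values."""
    counts = [0] * bins
    for pixel in frame:
        counts[pixel * bins // max_value] += 1
    return [count / len(frame) for count in counts]


def dissimilarity(frame_a, frame_b):
    """Sum of absolute histogram differences between two frames (range 0 to 2)."""
    hist_a, hist_b = histogram(frame_a), histogram(frame_b)
    return sum(abs(a - b) for a, b in zip(hist_a, hist_b))


def frame_differences(frames):
    """Dissimilarity between each pair of consecutive frames."""
    return [dissimilarity(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]


def cuts_hard(diffs, threshold=0.5):
    """Hard threshold: index i marks a cut between frame i and frame i + 1."""
    return [i for i, d in enumerate(diffs) if d > threshold]


def cuts_adaptive(diffs, window=3, k=3.0, floor=0.1):
    """Adaptive threshold (assumed rule): a cut is declared where the difference
    exceeds mean + k * std of the neighbouring differences, and a small floor."""
    cuts = []
    for i, d in enumerate(diffs):
        neighbours = diffs[max(0, i - window):i] + diffs[i + 1:i + 1 + window]
        if len(neighbours) < 2:
            continue  # not enough context to estimate local statistics
        local_threshold = mean(neighbours) + k * stdev(neighbours)
        if d > max(local_threshold, floor):
            cuts.append(i)
    return cuts
```

Sweeping `threshold` (or `k`) over a range and recording the resulting detection and false-alarm rates against annotated cuts is one way to obtain the kind of ROC curve the abstract mentions.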
2	Design of Indexing Strategies for Video Database System Chen, You-cheng 29 June 2005 In a video database, each video contains temporal and spatial relationships between content objects. The temporal relationships can be specified between frame sequences, and the spatial relationships can be specified by the relationships between objects in a single frame. Moreover, information about the locations and motions of objects is included in the video database. Many video indexing strategies have been proposed that include the above information to speed up query processing. For example, the 3D C-string strategy uses the projections of objects to represent spatial and temporal relations between objects in a video. Moreover, the 3D C-string strategy can keep track of the motions and size changes of the objects in a video. However, the 3D C-string strategy causes three problems. The first is that it cannot index videos in which an object appears and then disappears more than once. The second is that the representation of the 3D C-string is too complex for deriving spatial relationships. The last is that the 3D C-string cannot derive the absolute locations of objects, since it records only their relative locations. In this thesis, in order to solve the problems of the 3D C-string strategy, we propose three new spatial relationships. By making use of the three spatial relationships, we can express the condition that objects disappear and appear. Moreover, based on the sequence of spatial relationships, we can derive the temporal relationships. Based on this technique, we propose three indexing strategies for video databases. The first strategy is the Temporal UID Matrix (TUID) strategy. We use the 13 unique numbers used in the UID strategy and our 3 newly added unique numbers to represent spatial relationships. Then, we store the sequence of spatial relationships in the TUID matrix.
In this way, we can efficiently support spatial, temporal, and spatio-temporal relationship queries. However, since the TUID strategy does not record information about the objects themselves, it cannot support queries on object information. Therefore, we propose the second strategy, the 2D Video String strategy, to keep track of the motions, locations, and size changes associated with the video objects. Although the 2D Video String strategy can support all types of queries, it is less efficient than the TUID strategy. By making use of the advantages of both strategies, we propose another video indexing strategy, the Hybrid strategy, in which we record the information about objects in the diagonal part of the TUID matrix. Our simulation study shows that our proposed strategies provide a shorter search time for video data than Lee et al.'s 3D C-string strategy, except for the 2D Video String strategy on temporal queries. Video Database The motions of objects Temporal Relationships Spatial-Temporal Relationships Spatial Relationships
3	An Mpeg-7 Video Database System For Content-based Management And Retrieval Celik, Cigdem 01 October 2005 A video data model that allows efficient and effective representation and querying of spatio-temporal properties of objects has been previously developed. The data model is focused on the semantic content of video streams. Objects, events, and activities performed by objects are the main interests of the model. The model supports fuzzy spatial queries, including querying spatial relationships between objects and querying the trajectories of objects. In this thesis, this work is used as a basis for the development of an XML-based video database system. The system aims to be compliant with the MPEG-7 Multimedia Description Schemes in order to follow a universal standard. The system is implemented using a native XML database management system. Query entry is enhanced by integrating an NLP interface.
4	Recurrent neural networks for deception detection in videos Rodriguez-Meza, Bryan, Vargas-Lopez-Lavalle, Renzo, Ugarte, Willy 01 January 2022 Deception detection has always been a subject of interest. After all, determining whether a person is telling the truth could be decisive in many real-world cases. Current methods to discern deception require expensive equipment whose output must be read and interpreted by specialists. In this article, we carry out an exhaustive comparison between 9 different recurrent deep learning models based on facial landmark recognition, trained on a recent man-made database used to determine lies, comparing them by accuracy and AUC. We also propose two new metrics that represent the validity of each prediction. The results of a 5-fold cross validation show that, out of all the tested models, the Stacked GRU neural model has the highest AUC (0.9853) and the highest accuracy (93.69%) among the trained models. Then, our proposed Stacked GRU architecture is compared with other machine learning and deep learning methods, and it surpasses them in the AUC metric. These results indicate that we are not that far away from a future where deception detection could be accessible through computers or smart devices. / Peer reviewed Deception detection Deep learning Facial landmarks recognition Recurrent neural networks Video database
5	Scene Understanding For Real Time Processing Of Queries Over Big Data Streaming Video Aved, Alexander 01 January 2013 With heightened security concerns across the globe and the increasing need to monitor, preserve and protect infrastructure and public spaces to ensure proper operation, quality assurance and safety, numerous video cameras have been deployed. Accordingly, they also need to be monitored effectively and efficiently. However, relying on human operators to constantly monitor all the video streams is neither scalable nor cost effective. Humans can become subjective or fatigued, and can even exhibit bias, and it is difficult to maintain high levels of vigilance when capturing, searching for and recognizing events that occur infrequently or in isolation. These limitations are addressed in the Live Video Database Management System (LVDBMS), a framework for managing and processing live motion imagery data. It enables rapid development of video surveillance software, much like traditional database applications are developed today. Video stream processing applications developed this way, as well as ad hoc queries, are able to "reuse" advanced image processing techniques that have already been developed. This results in lower software development and maintenance costs. Furthermore, the LVDBMS can be intensively tested to ensure consistent quality across all associated video database applications. Its intrinsic privacy framework facilitates a formalized approach to the specification and enforcement of verifiable privacy policies. This is an important step towards enabling a general privacy certification for video surveillance systems by leveraging a standardized privacy specification language. With the potential to impact many important fields ranging from security and assembly line monitoring to wildlife studies and the environment, the broader impact of this work is clear.
The privacy framework protects the general public from abusive use of surveillance technology; success in addressing the "trust" issue will enable many new surveillance-related applications. Although this research focuses on video surveillance, the proposed framework has the potential to support many video-based analytical applications. Query language privacy framework video database system real time object recognition object tracking video stream Computer Sciences Engineering
6	An Intelligent Fuzzy Object-oriented Database Framework For Video Database Applications Ozgur, Nezihe Burcu 01 October 2007 Video database applications call for flexible and powerful modeling and querying facilities, which require an integration or interaction between database and knowledge base technologies. Many real-life video database applications must also incorporate uncertainty, which arises naturally from the complex and subjective semantic content of video data. In this thesis, first, a fuzzy conceptual data model is introduced to represent the semantic content of video data. UML (Unified Modeling Language) is utilized and extended to represent uncertain information along with video-specific properties at the conceptual level. Second, an intelligent fuzzy object-oriented database framework is presented for video database applications. The introduced fuzzy conceptual model is mapped to the presented framework, which is an adaptation of the previously proposed IFOOD architecture. The framework provides modeling and querying of the complex and rich semantic content and knowledge of video data, including uncertainty. Moreover, it allows (fuzzy) semantic, temporal, (fuzzy) spatial, hierarchical, regional and trajectory queries, based on the video data model. We think that the presented conceptual data model and framework can be adapted to any application domain related to video databases. QA Computer Software 76.75-76.765
7	Multi-viewpoint lane detection with applications in driver safety systems Borkar, Amol 19 December 2011 The objective of this dissertation is to develop a Multi-Camera Lane Departure Warning (MCLDW) system and a framework to evaluate it. A Lane Departure Warning (LDW) system is a safety feature that is included in a few luxury automobiles. Using a single camera, it performs the task of informing the driver if a lane change is imminent. The core component of an LDW system is a lane detector, whose objective is to find lane markers on the road. Therefore, we start this dissertation by explaining the requirements of an ideal lane detector, and then present several algorithmic implementations that meet these requirements. After selecting the best implementation, we present the MCLDW methodology. Using a multi-camera setup, the MCLDW system combines the detected lane marker information from each camera's view to estimate the immediate distance between the vehicle and the lane marker, and signals a warning if this distance is under a certain threshold. Next, we introduce a procedure to create ground truth and a database of videos, which together serve as the framework for evaluation. Ground truth is created using an efficient procedure called Time-Slicing that allows the user to quickly annotate the true locations of the lane markers in each frame of the videos. Subsequently, we describe the details of a database of driving videos that has been put together to help establish a benchmark for evaluating existing lane detectors and LDW systems. Finally, we conclude the dissertation by citing the contributions of the research and discussing the avenues for future work. Driver safety Ground truth Driver assistance system Driving video database Lane departure warning Lane detection Lane lines (Roads) Traffic lanes Road markings
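The warning rule described in the abstract above (combine per-camera lane marker information into one distance estimate, then warn when it falls under a threshold) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the camera views are combined, so the plain average used here, the function name `lane_departure_warning`, the units, and the default `threshold_m` are all hypothetical.

```python
def lane_departure_warning(distances_m, threshold_m=0.3):
    """Combine per-camera distance estimates and decide whether to warn.

    distances_m: one estimate per camera of the distance (in metres) between
    the vehicle and the nearest lane marker, or None when that camera found
    no marker. Averaging the valid estimates is an assumption for this sketch.
    Returns (estimate, warn); estimate is None when no camera saw a marker.
    """
    valid = [d for d in distances_m if d is not None]
    if not valid:
        return None, False  # no information, so no warning is raised
    estimate = sum(valid) / len(valid)
    return estimate, estimate < threshold_m
```

A real system would more plausibly weight each camera by detection confidence and smooth the estimate over time; the sketch only shows where the threshold enters.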
8	Rozpoznávání obličejů ve videosekvencích / Face recognition in video sequences Malach, Tobiáš January 2013 This thesis deals with the design, implementation and testing of a face recognition system that processes video sequences captured by CCTV systems. Based on a survey of face recognition methods, the use of Local Binary Pattern Histograms (LBPH) and a Nearest Neighbor (NN) classifier was proposed. The discriminative power of LBPH features was examined, and individual informative features were selected based on the Fisher discrimination ratio and mutual correlation. The cluster centroid method was used for pattern creation because, among the several proposed methods, it had the best effect on the system's face recognition capability. A software tool for efficient performance testing of face recognition algorithms was developed. The video database IFaViD was assembled for training and performance testing of the implemented face recognition system.
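To make the LBPH-plus-nearest-neighbour pipeline named in the abstract above more concrete, here is a minimal sketch, not the thesis's implementation. It assumes the basic 8-neighbour LBP operator, a single histogram over the whole image (practical LBPH systems usually concatenate histograms from a grid of regions), a per-person pattern formed as the centroid of that person's histograms, and squared Euclidean distance for matching. All function names are hypothetical.

```python
# Clockwise 8-neighbourhood, starting at the top-left neighbour.
OFFSETS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(img, r, c):
    """Basic LBP code of interior pixel (r, c): one bit per neighbour,
    set when the neighbour is at least as bright as the centre."""
    centre = img[r][c]
    code = 0
    for bit, (dr, dc) in enumerate(OFFSETS):
        if img[r + dr][c + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    rows, cols = len(img), len(img[0])
    counts = [0] * 256
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            counts[lbp_code(img, r, c)] += 1
    total = (rows - 2) * (cols - 2)
    return [count / total for count in counts]


def centroid(histograms):
    """Pattern for one person: the bin-wise mean of that person's histograms."""
    return [sum(column) / len(histograms) for column in zip(*histograms)]


def nearest_identity(query_hist, patterns):
    """Nearest-neighbour match of a query histogram against {name: pattern}."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(patterns, key=lambda name: squared_distance(query_hist, patterns[name]))
```

Because each LBP bit depends only on whether a neighbour is at least as bright as the centre, the codes are unchanged by monotonic brightness changes, which is one reason the feature suits CCTV footage with varying illumination.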
