• Refine Query
  • Source
  • Publication year
  • to
  • Language
  • 8
  • 2
  • 1
  • Tagged with
  • 16
  • 16
  • 5
  • 5
  • 4
  • 4
  • 3
  • 3
  • 3
  • 3
  • 3
  • 2
  • 2
  • 2
  • 2
  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Video annotation tools

Chaudhary, Ahmed 10 October 2008 (has links)
This research deals with annotations in scholarly work. Annotations have been studied by many people. A significant amount of research has shown that instead of implementing domain specific annotation applications a better approach is to develop general purpose annotation toolkits that can be used to create domain specific applications. A video annotation toolkit along with toolkits for searching, retrieving, analyzing and presenting videos can help achieve the broader goal of creating integrated work spaces for scholarly work in humanities research similar to existing environments in such fields as mathematics, engineering, statistics, software development and bioinformatics. This research implements a video annotation toolkit and evaluates it by looking at its usefulness in creating applications for different areas. It was found that many areas of study in the arts and sciences can benefit from a video annotation application tailored to their specific needs and that an annotation toolkit can significantly reduce the time for developing such applications. The toolkit was engineered through successive refinements of prototype applications developed for different application areas. The toolkit design was also guided by a set of features identified by the research community for an ideal general purpose annotation toolkit. This research contributes by combining these two different approaches to toolkit design and construction into a hybrid approach. This approach could be useful for similar or related efforts.
2

Video annotation tools

Chaudhary, Ahmed 10 October 2008 (has links)
This research deals with annotations in scholarly work. Annotations have been studied by many people. A significant amount of research has shown that instead of implementing domain specific annotation applications a better approach is to develop general purpose annotation toolkits that can be used to create domain specific applications. A video annotation toolkit along with toolkits for searching, retrieving, analyzing and presenting videos can help achieve the broader goal of creating integrated work spaces for scholarly work in humanities research similar to existing environments in such fields as mathematics, engineering, statistics, software development and bioinformatics. This research implements a video annotation toolkit and evaluates it by looking at its usefulness in creating applications for different areas. It was found that many areas of study in the arts and sciences can benefit from a video annotation application tailored to their specific needs and that an annotation toolkit can significantly reduce the time for developing such applications. The toolkit was engineered through successive refinements of prototype applications developed for different application areas. The toolkit design was also guided by a set of features identified by the research community for an ideal general purpose annotation toolkit. This research contributes by combining these two different approaches to toolkit design and construction into a hybrid approach. This approach could be useful for similar or related efforts.
3

Interactive video retrieval using implicit user feedback

Vrochidis, Stefanos January 2013 (has links)
In the recent years, the rapid development of digital technologies and the low cost of recording media have led to a great increase in the availability of multimedia content worldwide. This availability places the demand for the development of advanced search engines. Traditionally, manual annotation of video was one of the usual practices to support retrieval. However, the vast amounts of multimedia content make such practices very expensive in terms of human effort. At the same time, the availability of low cost wearable sensors delivers a plethora of user-machine interaction data. Therefore, there is an important challenge of exploiting implicit user feedback (such as user navigation patterns and eye movements) during interactive multimedia retrieval sessions with a view to improving video search engines. In this thesis, we focus on automatically annotating video content by exploiting aggregated implicit feedback of past users expressed as click-through data and gaze movements. Towards this goal, we have conducted interactive video retrieval experiments, in order to collect click-through and eye movement data in not strictly controlled environments. First, we generate semantic relations between the multimedia items by proposing a graph representation of aggregated past interaction data and exploit them to generate recommendations, as well as to improve content-based search. Then, we investigate the role of user gaze movements in interactive video retrieval and propose a methodology for inferring user interest by employing support vector machines and gaze movement-based features. Finally, we propose an automatic video annotation framework, which combines query clustering into topics by constructing gaze movement-driven random forests and temporally enhanced dominant sets, as well as video shot classification for predicting the relevance of viewed items with respect to a topic. The results show that exploiting heterogeneous implicit feedback from past users is of added value for future users of interactive video retrieval systems.
4

Machine learning architectures for video annotation and retrieval

Markatopoulou, Foteini January 2018 (has links)
In this thesis we are designing machine learning methodologies for solving the problem of video annotation and retrieval using either pre-defined semantic concepts or ad-hoc queries. Concept-based video annotation refers to the annotation of video fragments with one or more semantic concepts (e.g. hand, sky, running), chosen from a predefined concept list. Ad-hoc queries refer to textual descriptions that may contain objects, activities, locations etc., and combinations of the former. Our contributions are: i) A thorough analysis on extending and using different local descriptors towards improved concept-based video annotation and a stacking architecture that uses in the first layer, concept classifiers trained on local descriptors and improves their prediction accuracy by implicitly capturing concept relations, in the last layer of the stack. ii) A cascade architecture that orders and combines many classifiers, trained on different visual descriptors, for the same concept. iii) A deep learning architecture that exploits concept relations at two different levels. At the first level, we build on ideas from multi-task learning, and propose an approach to learn concept-specific representations that are sparse, linear combinations of representations of latent concepts. At a second level, we build on ideas from structured output learning, and propose the introduction, at training time, of a new cost term that explicitly models the correlations between the concepts. By doing so, we explicitly model the structure in the output space (i.e., the concept labels). iv) A fully-automatic ad-hoc video search architecture that combines concept-based video annotation and textual query analysis, and transforms concept-based keyframe and query representations into a common semantic embedding space. Our architectures have been extensively evaluated on the TRECVID SIN 2013, the TRECVID AVS 2016, and other large-scale datasets presenting their effectiveness compared to other similar approaches.
5

Semantics of Video Shots for Content-based Retrieval

Volkmer, Timo, timovolkmer@gmx.net January 2007 (has links)
Content-based video retrieval research combines expertise from many different areas, such as signal processing, machine learning, pattern recognition, and computer vision. As video extends into both the spatial and the temporal domain, we require techniques for the temporal decomposition of footage so that specific content can be accessed. This content may then be semantically classified - ideally in an automated process - to enable filtering, browsing, and searching. An important aspect that must be considered is that pictorial representation of information may be interpreted differently by individual users because it is less specific than its textual representation. In this thesis, we address several fundamental issues of content-based video retrieval for effective handling of digital footage. Temporal segmentation, the common first step in handling digital video, is the decomposition of video streams into smaller, semantically coherent entities. This is usually performed by detecting the transitions that separate single camera takes. While abrupt transitions - cuts - can be detected relatively well with existing techniques, effective detection of gradual transitions remains difficult. We present our approach to temporal video segmentation, proposing a novel algorithm that evaluates sets of frames using a relatively simple histogram feature. Our technique has been shown to range among the best existing shot segmentation algorithms in large-scale evaluations. The next step is semantic classification of each video segment to generate an index for content-based retrieval in video databases. Machine learning techniques can be applied effectively to classify video content. However, these techniques require manually classified examples for training before automatic classification of unseen content can be carried out. Manually classifying training examples is not trivial because of the implied ambiguity of visual content. We propose an unsupervised learning approach based on latent class modelling in which we obtain multiple judgements per video shot and model the users' response behaviour over a large collection of shots. This technique yields a more generic classification of the visual content. Moreover, it enables the quality assessment of the classification, and maximises the number of training examples by resolving disagreement. We apply this approach to data from a large-scale, collaborative annotation effort and present ways to improve the effectiveness for manual annotation of visual content by better design and specification of the process. Automatic speech recognition techniques along with semantic classification of video content can be used to implement video search using textual queries. This requires the application of text search techniques to video and the combination of different information sources. We explore several text-based query expansion techniques for speech-based video retrieval, and propose a fusion method to improve overall effectiveness. To combine both text and visual search approaches, we explore a fusion technique that combines spoken information and visual information using semantic keywords automatically assigned to the footage based on the visual content. The techniques that we propose help to facilitate effective content-based video retrieval and highlight the importance of considering different user interpretations of visual content. This allows better understanding of video content and a more holistic approach to multimedia retrieval in the future.
6

AnnotEasy: A gesture and speech-to-text based video annotation tool for note taking in pre-recorded lectures in higher education

Uggerud, Nils January 2021 (has links)
This paper investigates students’ attitudes towards using gestures and speech-to- text (GaST) to take notes while watching recorded lectures. A literature review regarding video based learning, an expert interview, and a background survey regarding students’ note taking habits led to the creation of the prototype AnnotEasy, a tool that allows students to use GaST to take notes. AnnotEasy was tested in three iterations against 18 students, and was updated after each iteration.  The students watched a five minute long lecture and took notes by using AnnotEasy. The participants’ perceived ease of use (PEU) and perceived usefulness (PU) was evaluated based on the TAM. Their general attitudes were evaluated in semi structured interviews.  The result showed that the students had a high PEU and PU of AnnotEasy. They were mainly positive towards taking notes by using GaST. Further, the result suggests that AnnotEasy could facilitate the process of structuring a lecture’s content. Lastly, even though students had positive attitudes towards using speech to create notes, observations showed that this was problematic when the users attempted to create longer notes. This indicates that speech could be more beneficial for taking shorter notes.
7

Exploiting Information Extraction Techniques For Automatic Semantic Annotation And Retrieval Of News Videos In Turkish

Kucuk, Dilek 01 February 2011 (has links) (PDF)
Information extraction (IE) is known to be an effective technique for automatic semantic indexing of news texts. In this study, we propose a text-based fully automated system for the semantic annotation and retrieval of news videos in Turkish which exploits several IE techniques on the video texts. The IE techniques employed by the system include named entity recognition, automatic hyperlinking, person entity extraction with coreference resolution, and event extraction. The system utilizes the outputs of the components implementing these IE techniques as the semantic annotations for the underlying news video archives. Apart from the IE components, the proposed system comprises a news video database in addition to components for news story segmentation, sliding text recognition, and semantic video retrieval. We also propose a semi-automatic counterpart of system where the only manual intervention takes place during text extraction. Both systems are executed on genuine video data sets consisting of videos broadcasted by Turkish Radio and Television Corporation. The current study is significant as it proposes the first fully automated system to facilitate semantic annotation and retrieval of news videos in Turkish, yet the proposed system and its semi-automated counterpart are quite generic and hence they could be customized to build similar systems for video archives in other languages as well. Moreover, IE research on Turkish texts is known to be rare and within the course of this study, we have proposed and implemented novel techniques for several IE tasks on Turkish texts. As an application example, we have demonstrated the utilization of the implemented IE components to facilitate multilingual video retrieval.
8

Semi-automatic Semantic Video Annotation Tool

Aydinlilar, Merve 01 December 2011 (has links) (PDF)
Semantic annotation of video content is necessary for indexing and retrieval tasks of video management systems. Currently, it is not possible to extract all high-level semantic information from video data automatically. Video annotation tools assist users to generate annotations to represent video data. Generated annotations can also be used for testing and evaluation of content based retrieval systems. In this study, a semi-automatic semantic video annotation tool is presented. Generated annotations are in MPEG-7 metadata format to ensure interoperability. With the help of image processing and pattern recognition solutions, annotation process is partly automated and annotation time is reduced. Annotations can be done for spatio-temporal decompositions of video data. Extraction of low-level visual descriptions are included to obtain complete descriptions.
9

How to annotate in video for training machine learning with a good workflow

Jakob, Persson January 2021 (has links)
Artificial intelligence and machine learning is used in a lot of different areas, one of those areas is image recognition. In the production of a TV-show or film, image recognition can be used to help the editors to find specific objects, scenes, or people in the video content, which speeds up the production. But image recognition is not working perfect all the time and can not be used in the production of a TV-show or film as it is intended to. Therefore the image recognition algorithms needs to be trained on large datasets to become better. But to create these datasets takes time and tools that can let users create specific datasets and retrain algorithms to become better is needed. The aim of this master thesis was to investigate if it was possible to create a tool that can annotate objects and people in video content and using the data as training sets, and a tool that can retrain the output of an image recognition to make the image recognition become better. It was also important that the tools have a good workflow for the users. The study consisted of a theoretical study to gain more knowledge about annotation, and how to make a good UX-design with a good workflow. Interviews were also held to get more knowledge of what the requirements of the product was. It resulted in a user scenario and a workflow that was used together with the knowledge from the theoretical study to create a hi-fi prototype by using an iterative process with usability testing. This resulted in a final hi-fi prototype with a good design and a good workflow for the users, where it is possible to annotate objects and people with a bounding box, and where it is possible to retrain an image recognition program that has been used on video content. / Artificiell intelligens och maskininlärning används inom många olika områden, ett av dessa områden är bildigenkänning. Vid produktionen av ett TV-program eller av en film kan bildigenkänning användas för att hjälpa redigerarna att hitta specifika objekt, scener eller personer i videoinnehållet, vilket påskyndar produktionen. Men bildigenkänningsprogram fungerar inte alltid helt perfekt och kan inte användas i produktionen av ett TV-program eller film som det är tänkt att användas i det sammanhanget. För att förbättra bildigenkänningsprogram så behöver dess algoritm tränas på stora datasets av bilder och labels. Men att skapa dessa datasets tar tid och det behövs program som kan skapa datasets och återträna algoritmer för bildigenkänning så att de fungerar bättre. Syftet med detta examensarbete var att undersöka om det var möjligt att skapa ett verktyg som kan markera(annotera) objekt och personer i video och använda datat som träningsdata för algoritmer. Men även att skapa ett verktyg som kan återträna algoritmer för bildigenkänning så att de blir bättre utifrån datat man får från ett bildigenkänningprogram. Det var också viktigt att dessa verktyg hade ett bra arbetsflöde för användarna. Studien bestod av en teoretisk studie för att få mer kunskap om annoteringar i video och hur man skapar bra UX-design med ett bra arbetsflöde. Intervjuer hölls också för att få mer kunskap om kraven på produkten och vilka som skulle använda den. Det resulterade i ett användarscenario och ett arbetsflöde som användes tillsammans med kunskapen från den teoretiska studien för att skapa en hi-fi prototyp, där en iterativ process med användbarhetstestning användes. Detta resulterade i en slutlig hi-fi prototyp med bra design och ett bra arbetsflöde för användarna där det är möjligt att markera(annotera) objekt och personer med en bounding box och där det är möjligt att återträna algoritmer för bildigenkänning som har körts på video.
10

Anotace obrazu a videa formou hry / Image and Video Annotation as a Game

Skowronek, Ondej January 2014 (has links)
This master thesis is oriented on a problem of creating video and image annotations. This problem is solved by crowdsourcing approach. Crowdsourcing games were designed and implemented to make solution of this problem . It was proven by testing that these games are capable of creating high quality annotations. Launching these games on a larger scale could create large database of annotated videos and images.

Page generated in 0.1074 seconds