1 |
Multimedia Data Mining and Retrieval for Multimedia Databases Using Associations and Correlations
Lin, Lin, 23 June 2010
With the explosion in the complexity and volume of pervasive multimedia data, there is high demand in many areas for multimedia services and applications that let people easily access and distribute multimedia data. Given abundant multimedia resources but inefficient and rather old-fashioned keyword-based information retrieval approaches, a content-based multimedia information retrieval (CBMIR) system is required to (i) reduce the dimension space to save storage and computation; (ii) advance multimedia learning methods to accurately identify target semantics, bridging the gap between low-level/mid-level features and high-level semantics; and (iii) effectively search media content for dynamic media delivery and enable extensive applications to be media-type driven. This research focuses on a multimedia data mining and retrieval system for multimedia databases and addresses several main challenges: data imbalance, data quality, the semantic gap, user subjectivity, and search issues. To that end, a novel CBMIR system is proposed in this dissertation. The proposed system uses both association rule mining (ARM) and multiple correspondence analysis (MCA), thereby taking into account both pattern discovery and statistical analysis. First, media content is represented by global and local low-level and mid-level features and stored in the multimedia database. Second, a data filtering component is proposed to improve data quality and reduce data imbalance. Specifically, the filtering step vertically selects features and horizontally prunes instances in multimedia databases. Third, a new learning and classification method that mines weighted association rules is proposed for the retrieval system. The MCA-based correlation is used to generate and select the weighted N-feature-value pair rules, where N varies from one to many.
Fourth, a ranking method independent of classifiers is proposed to sort the retrieved results and put the most interesting ones at the top of the browsing list. Finally, a user interface is implemented in the CBMIR system that allows the user to choose a concept of interest, searches media based on the target concept, ranks the retrieved segments using the proposed ranking algorithm, and then displays the top-ranked segments to the user. The system is evaluated with various high-level semantics from TRECVID benchmark data sets. The TRECVID sound and vision data is a large data set that includes various types of videos and has very rich semantics. Overall, the proposed system achieves promising results in comparison with other well-known methods. Moreover, experiments are conducted that compare each component with other well-known algorithms. The experimental results show that all proposed components improve the functionality of the CBMIR system, and that the proposed system achieves effectiveness, robustness, and efficiency for a high-dimensional multimedia database.
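The abstract above builds its classifier on rules that map feature-value pairs to a target concept. As a rough illustration only, the following Python sketch computes the plain support and confidence of single feature-value pair rules, which are the standard association rule mining measures. It does not reproduce the dissertation's MCA-based weighting; the function name `rule_stats` and the toy data are hypothetical.

```python
from collections import Counter

def rule_stats(instances, labels, target):
    """Return {(feature, value): (support, confidence)} for rules
    (feature = value) -> target over discretized instances."""
    n = len(instances)
    pair_counts = Counter()   # occurrences of each feature-value pair
    joint_counts = Counter()  # occurrences together with the target label
    for inst, label in zip(instances, labels):
        for pair in inst.items():
            pair_counts[pair] += 1
            if label == target:
                joint_counts[pair] += 1
    return {
        pair: (joint_counts[pair] / n, joint_counts[pair] / count)
        for pair, count in pair_counts.items()
    }

# Hypothetical toy data: discretized features of four video shots.
shots = [
    {"color": "green", "motion": "high"},
    {"color": "green", "motion": "low"},
    {"color": "gray", "motion": "high"},
    {"color": "green", "motion": "high"},
]
concepts = ["sports", "sports", "other", "sports"]
stats = rule_stats(shots, concepts, "sports")
```

Here `stats[("color", "green")]` gives the support and confidence of the rule "color = green implies sports". A weighted scheme such as the one the abstract describes would replace or rescale these scores before rules are selected.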
|
2 |
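Ähnlichkeitssuche in Multimedia-Datenbanken: Retrieval, Suchalgorithmen und Anfragebehandlung / Similarity Search in Multimedia Databases: Retrieval, Search Algorithms, and Query Processing
Schmitt, Ingo, January 2004
Also submitted as: habilitation thesis, University of Magdeburg, 2004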
|
3 |
Nouvelles méthodes pour la recherche sémantique et esthétique d'informations multimédia / Novel methods for semantic and aesthetic multimedia retrieval
Redi, Miriam, 29 May 2013
In the internet era, computerized classification and discovery of image properties (objects, scene, emotions generated, aesthetic traits) is of crucial importance for the automatic retrieval of the huge amount of visual data surrounding us. But how can computers see the meaning of an image?
Multimedia Information Retrieval (MMIR) is a research field that helps build intelligent systems that automatically recognize image content and its characteristics. In general, this is achieved by following a chain process: first, low-level features are extracted and pooled into compact image signatures. Then, machine learning techniques are used to build models able to distinguish between different image categories based on such signatures. These models are finally used to recognize the properties of a new image. Despite the advances in the field, human vision systems still substantially outperform their computer-based counterparts. In this thesis we therefore design a set of novel contributions for each step of the MMIR chain, aiming to improve overall recognition performance. In our work, we explore techniques from a variety of fields that are not traditionally related to multimedia retrieval, and embed them into effective MMIR frameworks. For example, we borrow the concept of image saliency from visual perception and use it to build low-level features. We employ the Copula theory of economic statistics for feature aggregation. We reuse the notion of graded relevance, popular in web page ranking, for visual retrieval frameworks. We explain our novel solutions in detail and demonstrate their effectiveness for image categorization, video retrieval, and aesthetics assessment.
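Graded relevance, mentioned in the abstract above, means that results carry a degree of relevance rather than a yes/no label. A common way to score a ranked list under graded relevance is normalized discounted cumulative gain (NDCG). The Python sketch below shows that standard measure; the abstract does not say which measure the thesis uses, so treating NDCG as the relevant one is an assumption.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: each grade is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))

def ndcg(relevances):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Hypothetical relevance grades of four retrieved images, in ranked order.
perfect = ndcg([3, 2, 1, 0])
swapped = ndcg([0, 3])
```

A list already sorted by grade scores 1.0, and placing a highly relevant item lower in the list reduces the score.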
|
4 |
Effiziente Strategien und Werkzeuge zur Generierung und Verwaltung von e-Learning-Systemen / Efficient Strategies and Tools for the Generation and Management of e-Learning Systems
Uesbeck, Mechthild, unknown date
Doctoral dissertation, University of Tübingen, 2001.
|
5 |
Mixed-initiative multimedia for mobile devices: design of a semantically relevant low latency system for news video recommendations
Lee, Jeannie Su Ann, 12 July 2010
The increasing ubiquity of networked mobile devices such as cell phones and PDAs has created new opportunities for the transmission and display of multimedia content. However, any mobile device has inherent resource constraints: low network bandwidth, small screen sizes, limited input methods, and low commitment viewing. Mobile systems that provide information display and access thus need to mitigate these various constraints. Despite progress in information retrieval and content recommendation, there has been less focus on issues arising from a network-oriented and mobile perspective.
This dissertation investigates a coordinated design approach to networked multimedia on mobile devices, and considers the system perspectives mentioned above. Within the context of accessing news video on mobile devices, the goal is to provide a cognitively palatable stream of videos and a seamless, low-latency user experience. Mixed-initiative, a method whereby intelligent services and users collaborate efficiently to achieve the user's goals, is the cornerstone of the system design. It integrates user relevance feedback with a content recommendation engine and a content- and network-aware video buffer prefetching technique. These components have been considered independently in prior system designs.
To overcome limited interactivity, a mixed-initiative user interface was used to present a sequence of news video clips to the user, along with operations to vote-up or vote-down a video to indicate its relevance. On-screen gesture equivalents of these operations were also implemented to reduce user interface elements occupying the screen. Semantic relevancy was then improved by extracting and indexing the content of each video clip as text features, and using a Naïve Bayesian content recommendation strategy that harnessed the user relevance feedback to tailor the subsequent video recommendations. With the system's knowledge of relevant videos, a content-aware video buffer prefetching scheme was then integrated, using the abovementioned feedback to lower the user perceived latency on the client-end.
As an information retrieval system consists of many interacting components, a client-server video streaming model was first developed for clarity and simplicity. Using a CNN news video clip database, experiments were then conducted with this model to simulate user scenarios. As the aim of improving semantic relevancy sometimes conflicts with user interface tools for interactivity and with user perceived latency, a quantitative evaluation was done to observe the tradeoffs between bandwidth, semantic relevance, and user perceived latency. Performance tradeoffs involving semantic relevancy and user perceived latency were then predicted.
In addition, complementary subjective tests with human users were conducted on actual mobile phone hardware running the Google Android platform. These experiments suggest that a mixed-initiative approach is helpful for recommending news video content on a mobile device, overcoming the mobile limitations of user interface tools for interactivity and client-end perceived latency. Users desired interactivity and responsiveness while viewing videos, and were willing to sacrifice some content relevancy in order to gain lower perceived latency.
Recommended future work includes expanding the content recommendation to incorporate viewing data from a large population, and the creation of a global hybrid content-based and collaborative filtering algorithm for better results. Also, based on existing user behaviour, users were reluctant to provide more input than necessary. Additional user experiments can be designed to quantify user attention and interest during video watching on a mobile device, and for better definition and incorporation of implicit user feedback.
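The abstract above describes a Naïve Bayesian recommender that learns from vote-up and vote-down feedback on text features. The Python sketch below is one minimal way such a recommender could look, assuming a multinomial model with add-one smoothing. The class name `FeedbackNaiveBayes`, its methods, and the toy clips are hypothetical and are not taken from the dissertation.

```python
import math
from collections import Counter

class FeedbackNaiveBayes:
    """Multinomial naive Bayes over text features, with votes as labels."""

    def __init__(self):
        self.word_counts = {"up": Counter(), "down": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def vote(self, words, label):
        """Record one relevance vote ("up" or "down") for a clip's words."""
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def log_score(self, words, label):
        """Log prior plus log likelihoods, with add-one smoothing."""
        total_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        total_words = sum(self.word_counts[label].values())
        # One extra slot in the denominator covers words never seen before.
        denom = total_words + len(self.vocab) + 1
        for word in words:
            score += math.log((self.word_counts[label][word] + 1) / denom)
        return score

    def rank(self, clips):
        """Order clip names by log-odds of "up" versus "down"."""
        def odds(words):
            return self.log_score(words, "up") - self.log_score(words, "down")
        return sorted(clips, key=lambda name: odds(clips[name]), reverse=True)

# Hypothetical feedback and candidate clips.
model = FeedbackNaiveBayes()
model.vote(["election", "senate", "vote"], "up")
model.vote(["football", "score", "league"], "down")
candidates = {"clip_b": ["league", "final"], "clip_a": ["senate", "debate"]}
ranking = model.rank(candidates)
```

In this toy run, the clip that shares a word with the up-voted clip is ranked first. A prefetching scheme like the one in the abstract could then buffer the top-ranked clips before the user requests them.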
|
6 |
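Informationsrecherche in Hypertext- und Multimedia-Dokumenten: Entwicklung eines kognitiven Navigationsmodells / Information Retrieval in Hypertext and Multimedia Documents: Development of a Cognitive Navigation Model
Laus, Frank O., January 2001
Also submitted as: doctoral dissertation, University of Münster (Westfalen), 2000.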
|
7 |
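Generische Verkettung maschineller Ansätze der Bilderkennung durch Wissenstransfer in verteilten Systemen: Am Beispiel der Aufgabengebiete INS und ACTEv der Evaluationskampagne TRECVid / Generic Chaining of Machine-Based Image Recognition Approaches through Knowledge Transfer in Distributed Systems: The Example of the INS and ACTEv Tasks of the TRECVid Evaluation Campaign
Roschke, Christian, 08 November 2021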
Technological advances in multimedia sensing and in the related methods for data acquisition, storage, and processing are leading to immense amounts of data in media libraries and knowledge management systems in the Big Data environment. The underlying state-of-the-art processing algorithms are often developed in a problem-oriented manner. Due to the enormous amounts of data, reliable conclusions about quality and applicability can be drawn only to a limited extent. The intellectual indexing of large corpora is likewise difficult, because valid statements would require almost the entire data volume to be checked semi-manually, which demands specific expertise in the underlying data domain as well as an understanding of data handling and classification processes. In addition, there are separate requirements for hardware and software, which usually scale suboptimally because they are mostly developed and executed on multicore computers without provision for the necessary distribution. Consequently, there is a lack of mechanisms to ensure the transferability of the methods to other application domains.
The focus of this work is the design and development of a distributed holistic infrastructure that enables the automated processing of multimedia data in terms of feature extraction, data fusion, and metadata search within a homogeneous yet distributed system. In this context, approaches from the areas of machine learning, distributed systems, data management, and virtualization are combined so that they can be applied to large data sets, evaluated, and optimized. In particular, current technologies and frameworks for pattern detection are analyzed and subjected to a performance evaluation so that a catalog of criteria can be derived. The criteria identified in this way form the basis for a requirements analysis and the conceptual design of the required infrastructure. This architecture forms the basis for experiments in the Big Data environment in context-specific use cases from scientific evaluation campaigns, such as TRECVid. For this purpose, the generic applicability is examined in the two task areas Instance Search and Activities in Extended Video.
Table of contents:
List of Figures
List of Tables
1 Motivation
2 Methods and Strategies
3 System Architecture
4 Instance Search
5 Activities in Extended Video
6 Summary and Outlook
Appendix
Bibliography
|