Global ETD Search

1	Conformal Predictions in Multimedia Pattern Recognition January 2010 abstract: The fields of pattern recognition and machine learning are on a fundamental quest to design systems that can learn the way humans do. One important aspect of human intelligence that has so far not been given sufficient attention is the capability of humans to express when they are certain about a decision and when they are not. Machine learning techniques today are not yet fully equipped to be trusted with this critical task. This work seeks to address this fundamental knowledge gap. Existing approaches that provide a measure of confidence on a prediction, such as learning algorithms based on Bayesian theory or Probably Approximately Correct theory, require strong assumptions or often produce results that are not practical or reliable. The recently developed Conformal Predictions (CP) framework - which is based on the principles of hypothesis testing, transductive inference and algorithmic randomness - provides a game-theoretic approach to the estimation of confidence with several desirable properties, such as online calibration and generalizability to all classification and regression methods. This dissertation builds on the CP theory to compute reliable confidence measures that aid decision-making in real-world problems through: (i) development of a methodology for learning a kernel function (or distance metric) for optimal and accurate conformal predictors; (ii) validation of the calibration properties of the CP framework when applied to multi-classifier (or multi-regressor) fusion; and (iii) development of a methodology to extend the CP framework to continuous learning, by using the framework for online active learning. 
These contributions are validated on four real-world problems from the domains of healthcare and assistive technologies: two classification-based applications (risk prediction in cardiac decision support and multimodal person recognition), and two regression-based applications (head pose estimation and saliency prediction in images). The results obtained show that: (i) multiple kernel learning can effectively increase efficiency in the CP framework; (ii) quantile p-value combination methods provide a viable solution for fusion in the CP framework; and (iii) eigendecomposition of p-value difference matrices can serve as an effective measure for online active learning. These results demonstrate the promise of these contributions for multimedia pattern recognition problems in real-world settings. / Dissertation/Thesis / Ph.D. Computer Science 2010 Computer Science Artificial Intelligence Information Science Confidence estimation Conformal predictions Machine learning Multimedia computing Pattern recognition
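As an illustration of the hypothesis-testing idea behind the CP framework mentioned in the abstract above, the following is a minimal sketch of the standard conformal p-value, not the dissertation's own method. It assumes nonconformity scores have already been computed by some underlying model; the function names `conformal_p_value` and `prediction_set` and the example scores are hypothetical.

```python
def conformal_p_value(reference_scores, candidate_score):
    """Share of scores at least as nonconforming as the candidate.

    The candidate itself is counted, hence the +1 in the numerator
    and in the denominator.
    """
    at_least = sum(1 for s in reference_scores if s >= candidate_score)
    return (at_least + 1) / (len(reference_scores) + 1)


def prediction_set(reference_scores, candidate_scores_by_label, significance):
    """Keep every label whose p-value exceeds the significance level."""
    return {
        label
        for label, score in candidate_scores_by_label.items()
        if conformal_p_value(reference_scores, score) > significance
    }


if __name__ == "__main__":
    # Hypothetical nonconformity scores of previously seen examples.
    reference = [0.1, 0.2, 0.3, 0.4]
    # Hypothetical scores of a new example under each tentative label.
    candidates = {"a": 0.25, "b": 0.5}
    print(prediction_set(reference, candidates, significance=0.2))
```

A larger prediction set signals lower confidence; how the nonconformity scores are defined (for example, through a learned kernel) is exactly what determines how small these sets can be.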
2	Mining Semantics from Low-level Features in Multimedia Computing January 2011 abstract: Bridging the semantic gap is one of the fundamental problems in multimedia computing and pattern recognition. The challenge of associating low-level signals with their high-level semantic interpretation arises mainly because semantics are often conveyed implicitly in a context, relying on interactions among multiple levels of concepts or low-level data entities. In addition, domain knowledge is often indispensable for uncovering the underlying semantics, but in most cases such knowledge is not readily available from the acquired media streams. Thus, making use of various types of contextual information and leveraging the corresponding domain knowledge are vital for accurately associating high-level semantics with low-level signals in multimedia computing problems. In this work, novel computational methods are explored and developed for incorporating contextual information/domain knowledge in different forms for multimedia computing and pattern recognition problems. Specifically, a novel Bayesian approach with statistical-sampling-based inference is proposed for incorporating a special type of domain knowledge, a spatial prior for the underlying shapes; cross-modality correlations are explored via Kernel Canonical Correlation Analysis, and the learnt space is then used for associating multimedia content in different forms; and contextual information modeled as a graph is leveraged for regulating interactions among high-level semantic concepts (e.g., category labels) and low-level input signals (e.g., spatial/temporal structure). Four real-world applications, namely visual-to-tactile face conversion, photo tag recommendation, wild web video classification and unconstrained consumer video summarization, are selected to demonstrate the effectiveness of the approaches. 
These applications range from classic research challenges to emerging tasks in multimedia computing. Results from experiments on large-scale real-world data, with comparisons to other state-of-the-art methods and subjective evaluations with end users, confirm that the developed approaches exhibit salient advantages, suggesting that they are promising for leveraging contextual information/domain knowledge across a wide range of multimedia computing and pattern recognition problems. / Dissertation/Thesis / Ph.D. Computer Science 2011 Computer Science Applications Contextual Information/Domain knowledge Multimedia computing Semantic analysis
3	Amélioration de la transmission de contenus vidéo et de données dans les réseaux sans-fil / Improving the transmission of video and data in wireless networks Ramadan, Wassim 04 July 2011 This thesis deals with improving data transfer, both over wireless networks and for continuous data such as video. To improve transmission over wireless networks, we studied the congestion control of transport protocols, and we also proposed a practical method for adapting video to network conditions. The thesis therefore has two parts. The first part concerns loss differentiation, that is, distinguishing congestion losses from losses on the wireless link. When a loss occurs, current transport protocols reduce the sending rate (for example, by half). For wireless losses, however, this reduction serves no purpose. To differentiate these losses at the data sender, we propose a novel method that uses both ECN (Explicit Congestion Notification) and the change in the RTT of the packet that follows the loss. The second part proposes a novel method for video adaptation at the application layer of the sender. With the arrival of high-bitrate video (HD, 3D) and the steady but irregular growth of network bandwidth, the video quality delivered to the user lags behind: it is non-optimal (the bitrate is much lower or much higher than the available throughput) and not adaptable (to the dynamic conditions of the network). The method we propose is very simple to implement, since it requires a change only on the sender side, at the application layer. It continuously adapts the video bitrate to network conditions; in other words, it performs congestion control at the sender. Videoconferencing is an ideal use case. 
This method works on top of any transport protocol with congestion control (TCP, DCCP), which also gives it the property of TCP-friendliness. 
Multimedia streaming Video adaptation Oscillation avoidance Transport protocols Wireless networks Loss differentiation Multimedia (computing) Networks Wireless 004.6
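The loss differentiation idea in the abstract above (combining ECN with the RTT change of the packet following a loss) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the actual decision rule, so the rule below, the 1.2 threshold ratio, and the names `classify_loss` and `next_rate` are all hypothetical.

```python
def classify_loss(ecn_marked, rtt_after_loss_ms, baseline_rtt_ms,
                  rtt_increase_ratio=1.2):
    """Guess whether a loss was caused by congestion or by the wireless link.

    Assumed rule (not taken from the thesis): treat the loss as a congestion
    loss if ECN signalled congestion, or if the RTT of the packet following
    the loss rose noticeably above a baseline RTT.
    """
    rtt_grew = rtt_after_loss_ms > baseline_rtt_ms * rtt_increase_ratio
    if ecn_marked or rtt_grew:
        return "congestion"
    return "wireless"


def next_rate(current_rate_kbps, loss_kind):
    """Halve the sending rate only for congestion losses."""
    if loss_kind == "congestion":
        return current_rate_kbps / 2
    return current_rate_kbps


if __name__ == "__main__":
    kind = classify_loss(ecn_marked=False, rtt_after_loss_ms=50,
                         baseline_rtt_ms=50)
    print(kind, next_rate(1000.0, kind))
```

The point of the distinction is visible in `next_rate`: a sender that cannot tell the two kinds of loss apart would halve its rate in both cases, which is wasteful when the loss came from the wireless link.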
