Global ETD Search

81	Modeling and Measuring Cognitive Load to Reduce Driver Distraction in Smart Cars January 2015 (has links) abstract: Driver distraction research has a long history spanning nearly 50 years and has intensified in the last decade. The focus has always been on identifying distracting tasks and measuring the harm each one causes. As in-vehicle technology advances, the list of distracting activities grows along with crash risk. These activities are also becoming more common and more complicated, especially with regard to in-car interactive systems. This work focuses on driver distraction caused by the in-car interactive system. Many user interaction designs (buttons, speech, visual) for human-car communication have been proposed, past and present, and all related studies suggest that the driver distraction level is still high and that a better design is needed. Multimodal interaction (MMI) is a design approach that relies on multiple modes for humans to interact with the car, reducing driver distraction by allowing the driver to choose the most suitable mode with minimum distraction. Additionally, combining multiple modes simultaneously provides more natural interaction, which could lead to less distraction. The main goal of MMI is to enable the driver to be more attentive to driving tasks and spend less time fiddling with distracting tasks. An engineering-based method is used to measure driver distraction. This method uses metrics such as reaction time, acceleration, and lane departure obtained from test cases. / Dissertation/Thesis / presentation / REACTION TIMES / DRIVING DATA RESULTS / Masters Thesis Computer Science 2015 Computer science Computer engineering Automotive engineering cognitive load driver distraction human-car interaction modeling multi-modal smart cars
82	Att utvärdera AdApt, ett multimodalt konverserande dialogsystem, med PARADISE / Evaluating AdApt, a multi-modal conversational, dialogue system, using PARADISE Hjalmarsson, Anna January 2003 (has links) This master’s thesis presents experiences from an evaluation of AdApt, a multi-modal, conversational dialogue system, using PARADISE (PARAdigm for Dialogue System Evaluation), a general framework for evaluation. The purpose of this master’s thesis was to assess PARADISE as an evaluation tool for such a system. An experimental study with 26 subjects was performed. The subjects were asked to interact with one of three different system versions of AdApt. Data was collected through questionnaires, hand tagging of the dialogues, and automatic logging of the interaction. Analysis of the results suggests that further research is needed to develop a general evaluation framework that is easy to apply and can be used for varying kinds of spoken dialogue systems. The data collected in this study can be used as a starting point for further research. Interdisciplinary studies PARADISE AdApt system evaluation dialogue system multi-modal TVÄRVETENSKAP Social Sciences Interdisciplinary
83	A Unified Decision Framework for Multi-Modal Traffic Signal Control Optimization in a Connected Vehicle Environment Zamanipour, Mehdi January 2016 (has links) Motivated by recent advances in vehicle positioning and vehicle-to-infrastructure (V2I) communication, traffic signal controllers are able to make smarter decisions. Most of the current state-of-the-practice signal priority control systems aim to provide priority for only one mode or are based on first-come-first-served logic. Consideration of priority control in a more general framework allows for several different modes of travelers to request priority at any time from any approach and for other traffic control operating principles, such as coordination, to be considered within an integrated signal timing framework. This leads to provision of priority to connected priority-eligible vehicles with minimum negative impact on regular vehicles. This dissertation focuses on providing a real-time decision-making framework for multi-modal traffic signal control that considers several transportation modes in a unified framework using Connected Vehicle (CV) technologies. The unified framework is based on a systems architecture for CVs that is applicable in both simulated and real-world (field) testing conditions. The system architecture is used to design both hardware-in-the-loop and software-in-the-loop CV simulation environments. A real-time priority control optimization model and an implementation algorithm are developed using priority-eligible vehicle data. The optimization model is extended to include signal coordination concepts. As the penetration rate of CVs increases, the ability to predict the queue accurately increases. It is shown that accurate queue prediction improves the performance of the optimization model in reducing the delay of priority-eligible vehicles. 
The model is generalized to consider regular CVs as well as priority vehicles and coordination priority requests in a unified mathematical model. It is shown that the model can react properly to the decision makers' modal preferences. Mathematical Modeling Multi-Modal Traffic Signal System Optimization Traffic Signal Priority Control Systems & Industrial Engineering Connected Vehicle Technology
84	Multi-modální "Restricted Boltzmann Machines" / Multi-Modal Restricted Boltzmann Machines Svoboda, Jiří January 2013 (has links) This thesis explores how multi-modal Restricted Boltzmann Machines (RBMs) can be used in content-based image tagging. The work also contains a brief analysis of modalities that can be used for multi-modal classification. Various RBMs suitable for different kinds of input data are also described. The design and implementation of a multi-modal RBM are described together with the results of preliminary experiments.
85	Detecting Non-Natural Objects in a Natural Environment using Generative Adversarial Networks with Stereo Data Gehlin, Nils, Antonsson, Martin January 2020 (has links) This thesis investigates the use of Generative Adversarial Networks (GANs) for detecting images containing non-natural objects in natural environments and whether the introduction of stereo data can improve performance. The state-of-the-art GAN-based anomaly detection method presented by A. Berg et al. in [5] (BergGAN) was the basis of this thesis. By modifying BergGAN to accept not only three-channel input but also four- and six-channel input, it was possible to investigate the effect of introducing stereo data into the method. The input to the four-channel network was an RGB image and its corresponding disparity map, and the input to the six-channel network was a stereo pair consisting of two RGB images. The three datasets used in the thesis were constructed from a dataset of aerial video sequences provided by SAAB Dynamics, where the scene was mostly wooded areas. The datasets were divided into training and validation data, where the latter was used for the performance evaluation of the respective network. The evaluation method suggested in [5] was used in the thesis: each sample was scored on the likelihood of it containing anomalies, Receiver Operating Characteristic (ROC) analysis was then applied, and the area under the ROC curve was calculated. The results showed that BergGAN was able to detect images containing non-natural objects in natural environments using the dataset provided by SAAB Dynamics. The adaptation of BergGAN to also accept four and six input channels increased the performance of the method, showing that there is information in stereo data that is relevant for GAN-based anomaly detection. 
There was, however, no substantial performance difference between the network trained with two RGB images and the one trained with an RGB image and its corresponding disparity map. deep learning anomaly detection GAN BergGAN pGAN stereo non-natural objects natural environment multi-modal anomaly detection Signal Processing Signalbehandling
86	Robust and comprehensive joint image-text representations / Recherche multimédia à large échelle Tran, Thi Quynh Nhi 03 May 2017 (has links) La présente thèse étudie la modélisation conjointe des contenus visuels et textuels extraits à partir des documents multimédias pour résoudre les problèmes intermodaux. Ces tâches exigent la capacité de « traduire » l'information d'une modalité vers une autre. Un espace de représentation commun, par exemple obtenu par l'Analyse Canonique des Corrélations ou son extension kernelisée, est une solution généralement adoptée. Sur cet espace, images et texte peuvent être représentés par des vecteurs de même type sur lesquels la comparaison intermodale peut se faire directement. Néanmoins, un tel espace commun souffre de plusieurs déficiences qui peuvent diminuer la performance de ces tâches. Le premier défaut concerne des informations qui sont mal représentées sur cet espace pourtant très importantes dans le contexte de la recherche intermodale. Le deuxième défaut porte sur la séparation entre les modalités sur l'espace commun, ce qui conduit à une limite de qualité de traduction entre modalités. Pour faire face au premier défaut concernant les données mal représentées, nous avons proposé un modèle qui identifie tout d'abord ces informations et puis les combine avec des données relativement bien représentées sur l'espace commun. Les évaluations sur la tâche d'illustration de texte montrent que la prise en compte de ces informations améliore fortement les résultats de la recherche intermodale. La contribution majeure de la thèse se concentre sur la séparation entre les modalités sur l'espace commun pour améliorer la performance des tâches intermodales. Nous proposons deux méthodes de représentation pour les documents bi-modaux ou uni-modaux qui regroupent à la fois des informations visuelles et textuelles projetées sur l'espace commun. 
Pour les documents uni-modaux, nous suggérons un processus de complétion basé sur un ensemble de données auxiliaires pour trouver les informations correspondantes dans la modalité absente. Ces informations complémentaires sont ensuite utilisées pour construire une représentation bi-modale finale pour un document uni-modal. Nos approches permettent d'obtenir des résultats de l'état de l'art pour la recherche intermodale ou la classification bi-modale et intermodale. / This thesis investigates the joint modeling of visual and textual content of multimedia documents to address cross-modal problems. Such tasks require the ability to match information across modalities. A common representation space, obtained by e.g. Kernel Canonical Correlation Analysis, on which images and text can both be represented and directly compared, is a generally adopted solution. Nevertheless, such a joint space still suffers from several deficiencies that may hinder the performance of cross-modal tasks. An important contribution of this thesis is therefore to identify two major limitations of such a space. The first limitation concerns information that is poorly represented on the common space yet very significant for a retrieval task. The second limitation consists in a separation between modalities on the common space, which leads to coarse cross-modal matching. To deal with the first limitation concerning poorly represented data, we put forward a model which first identifies such information and then finds ways to combine it with data that is relatively well represented on the joint space. Evaluations on text illustration tasks show that by appropriately identifying and taking such information into account, the results of cross-modal retrieval can be strongly improved. 
The major work in this thesis aims to cope with the separation between modalities on the joint space to enhance the performance of cross-modal tasks. We propose two representation methods for bi-modal or uni-modal documents that aggregate information from both the visual and textual modalities projected on the joint space. Specifically, for uni-modal documents we suggest a completion process relying on an auxiliary dataset to find the corresponding information in the absent modality and then use such information to build a final bi-modal representation for a uni-modal document. Evaluations show that our approaches achieve state-of-the-art results on several standard and challenging datasets for cross-modal retrieval or bi-modal and cross-modal classification. Espace commun de representation Recherche intermodale Représentation multimodale Common representation space Cross-Modal classification/retrieval Multi-Modal representation 003.3
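87	Robust Audio Scene Analysis for Rescue Robots / レスキューロボットのための頑健な音環境理解 Bando, Yoshiaki 26 March 2018 (has links) Kyoto University / 0048 / New-system course doctorate / Doctor of Informatics / Kō No. 21209 / Jōhaku No. 662 / 新制||情||114 (University Library) / Kyoto University Graduate School of Informatics, Department of Intelligence Science and Technology / (Chief examiner) Professor Tatsuya Kawahara, Professor Hisashi Kashima, Professor Toshiyuki Tanaka, Lecturer Kazuyoshi Yoshii / Conforms to Article 4, Paragraph 1 of the Degree Regulations / Doctor of Informatics / Kyoto University / DFAM Audio signal processing Multi-modal signal processing Rescue robotics Speech enhancement Posture estimation Hose-shaped rescue robot 007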
88	Multi-Scale and Multi-Modal Streaming Data Aggregation and Processing for Decision Support during Natural Disasters Kar, Shruti January 2018 (has links) No description available. Computer Science multi-modal data natural disasters disaster-related tweets gazetteers relief effort coordination OpenStreetMap geolocate flood mapping DisasterRecord
89	Drug Loaded Multifunctional Microparticles for Anti-VEGF Therapy of Exudative Age-related Macular Degeneration Zhang, Leilei January 2012 (has links) No description available. Biomedical Engineering microparticles age-related macular degeneration AMD VEGF drug delivery electrohydrodynamic electrospray multi-modal imaging coaxial instability analysis
90	Exploring Multi-Domain and Multi-Modal Representations for Unsupervised Image-to-Image Translation Liu, Yahui 20 May 2022 (has links) Unsupervised image-to-image translation (UNIT) is a challenging task in the image manipulation field, where input images in a visual domain are mapped into another domain with desired visual patterns (also called styles). An ideal direction in this field is to build a model that can map an input image in a domain to multiple target domains and generate diverse outputs in each target domain, which is termed multi-domain and multi-modal unsupervised image-to-image translation (MMUIT). Recent studies have shown remarkable results in UNIT, but they suffer from four main limitations: (1) State-of-the-art UNIT methods are either built from several two-domain mappings that must be learned independently, or they generate low-diversity results, a phenomenon also known as mode collapse. (2) Most manipulation relies on visual maps or digital labels without exploring natural language, which could be more scalable and flexible in practice. (3) In an MMUIT system, the style latent space is usually disentangled between every two image domains. While interpolations within a domain are smooth, interpolating between two randomly sampled style representations from two different domains often results in unrealistic images with artifacts. Improving the smoothness of the style latent space can lead to gradual interpolations between any two style latent representations, even across domains. (4) It is expensive to train MMUIT models from scratch at high resolution. Interpreting the latent space of pre-trained unconditional GANs can achieve pretty good image translations, especially high-quality synthesized images (e.g., 1024x1024 resolution). However, few works explore building an MMUIT system with such pre-trained GANs. 
In this thesis, we focus on these vital issues and propose several techniques for building better MMUIT systems. First, we build on the content-style disentangled framework and propose to fit the style latent space with Gaussian Mixture Models (GMMs). This allows a well-trained network using a shared disentangled style latent space to model multi-domain translations. Meanwhile, we can randomly sample different style representations from a Gaussian component or use a reference image for style transfer. Second, we show how the GMM-modeled latent style space can be combined with a language model (e.g., a simple LSTM network) to manipulate multiple styles by using textual commands. Then, we not only propose easy-to-use constraints to improve the smoothness of the style latent space in MMUIT models, but also design a novel metric to quantitatively evaluate the smoothness of the style latent space. Finally, we build a new model that uses pre-trained unconditional GANs to perform MMUIT tasks.
