1 |
Vision and language understanding with localized evidence. Xu, Huijuan. 16 February 2019
Enabling machines to solve computer vision tasks with natural language components can greatly improve human interaction with computers. In this thesis, we address vision and language tasks with deep learning methods that explicitly localize relevant visual evidence. Spatial evidence localization in images enhances the interpretability of the model, while temporal localization in video is necessary to remove irrelevant content. We apply our methods to various vision and language tasks, including visual question answering, temporal activity detection, dense video captioning and cross-modal retrieval.
First, we tackle the problem of image question answering, which requires the model to predict answers to questions posed about images. We design a memory network with a question-guided spatial attention mechanism that assigns higher weights to regions that are more relevant to the question. The visual evidence used to derive the answer can be shown by visualizing the attention weights in images. We then address the problem of localizing temporal evidence in videos. For most language/vision tasks, only part of the video is relevant to the linguistic component, so we need to detect these relevant events in videos. We propose an end-to-end model for temporal activity detection, which detects arbitrary-length activities by coordinate regression with respect to anchors and contains a proposal stage that filters out background segments, saving computation time. We further extend activity category detection to event captioning, which can express richer semantic meaning than a class label. This leads to the problem of dense video captioning, which involves two sub-problems: localizing distinct events in a long video and generating captions for the localized events. We propose an end-to-end hierarchical captioning model with vision and language context modeling, in which the captioning training affects the activity localization. Lastly, the task of text-to-clip video retrieval requires one to localize the specified query instead of detecting and captioning all events. We propose a model based on the early fusion of words and visual features, outperforming standard approaches that embed the whole sentence before performing late feature fusion. Furthermore, we use queries to regulate the proposal network so that it generates query-related proposals.
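The question-guided spatial attention described above can be illustrated with a minimal sketch. It assumes dot-product relevance scores and a softmax over regions, which is one common formulation and not necessarily the one used in the thesis; the function name `spatial_attention` and the toy feature vectors are hypothetical.

```python
import math

def spatial_attention(region_features, question_vec):
    """Return (weights, attended) for question-guided attention over image regions.

    Assumption: relevance is a plain dot product between each region feature
    and the question embedding; the thesis may use a different scoring function.
    """
    # Relevance score of each region with respect to the question.
    scores = [sum(r * q for r, q in zip(region, question_vec))
              for region in region_features]
    # Softmax over regions, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Attended evidence vector: attention-weighted sum of the region features.
    dim = len(region_features[0])
    attended = [sum(w * region[d] for w, region in zip(weights, region_features))
                for d in range(dim)]
    return weights, attended

# Hypothetical toy input: three regions with 2-dimensional features.
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question = [2.0, 0.0]
weights, attended = spatial_attention(regions, question)
```

The weights sum to one, so they can be rendered directly as a heat map over the regions, which is the kind of visualization the abstract refers to.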
In conclusion, our proposed visual localization mechanism applies across a variety of vision and language tasks and achieves state-of-the-art results. Together with the inference module, our work can contribute to solving other tasks such as video question answering in future research.
|
2 |
Responses of Madagascar's Endemic Carnivores to Fragmentation, Hunting, and Exotic Carnivores Across the Masoala-Makira Landscape. Farris, Zachary J. 06 January 2015
The carnivores of Madagascar are likely the least studied of the world's carnivores; thus, little is known about threats to their persistence. I provide the first long-term assessment of Madagascar's rainforest carnivore community, including: 1) how multiple forms of habitat degradation (i.e., fragmentation, exotic carnivores, human encroachment, and hunting) affect native and exotic carnivore occupancy; 2) how native and exotic carnivore temporal activity overlaps and how body size and niche explain these patterns; 3) how native and exotic carnivores spatially co-occur across the landscape and which variables explain these relationships; and 4) how native and exotic carnivores and humans co-occur with lemurs across Madagascar's largest protected landscape: the Masoala-Makira landscape. From 2008 to 2013 I photographically sampled carnivores and conducted line-transect surveys of lemurs at seven study sites with varying degrees of degradation and human encroachment, including repeat surveys of two sites. As degradation increased, exotic carnivores showed increases in activity and occupancy while endemic carnivore, small mammal, and lemur occupancy and/or activity decreased. Wild/feral cats (Felis sp.) and dogs (Canis familiaris) had higher occupancy (0.37 ± SE 0.08 and 0.61 ± SE 0.07, respectively) than half of the endemic carnivore species across the landscape. Additionally, exotic carnivores had both direct and indirect negative effects on native carnivore occupancy. For example, spotted fanaloka (Fossa fossana) occupancy (0.70 ± SE 0.07) was negatively impacted by both wild/feral cats (beta = -2.65) and Indian civets (beta = -1.20). My results revealed intense pressure from hunting (e.g., n = 31 fosa Cryptoprocta ferox consumed per year from 2005-2011 across four villages), including evidence that hunters target intact forest where native carnivore and lemur occupancy and/or activity are highest.
I found evidence of high temporal overlap between native and exotic carnivores (e.g., temporal overlap between brown-tail vontsira Salanoia concolor and dogs is 0.88), including fosa (Cryptoprocta ferox) avoiding dogs and humans across all seasons. However, I found no evidence of body size or correlates of ecological niche explaining temporal overlap among carnivores. Estimates of spatial co-occurrence among native and exotic carnivores in rainforest habitat revealed strong evidence that native and exotic carnivores occur together less often than expected and that exotic carnivores may be replacing native carnivores in forests close to human settlements. For example, falanouc show a strong increase in occupancy when dogs are absent (0.69 ± SE 0.11) compared to when they are present (0.23 ± SE 0.05). Finally, the two-species interaction occupancy models for carnivores and lemurs revealed a higher number of interactions among species across contiguous forest, where carnivore and lemur occupancy were highest. These various anthropogenic pressures and their effects on carnivore and lemur populations, particularly increases in exotic carnivores and hunting, have wide-ranging, global implications and demand effective management plans to target the influx of exotic carnivores and unsustainable hunting affecting carnivore and primate populations across Madagascar and worldwide. / Ph. D.
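For readers unfamiliar with temporal overlap values such as the 0.88 reported above, the following is a rough sketch of how an overlap coefficient between two activity patterns can be computed. The abstract does not state which estimator was used; this version simply bins detection times by hour and sums the smaller proportion in each bin, whereas published camera-trap analyses often use kernel density estimates instead. The function name and the detection times are hypothetical, not data from the thesis.

```python
def overlap_coefficient(hours_a, hours_b, bins=24):
    """Histogram-based overlap between two sets of detection times (in hours, 0-24).

    Returns a value between 0 (no shared activity periods) and 1 (identical patterns).
    Assumption: a coarse binned estimate, not the estimator used in the thesis.
    """
    def proportions(hours):
        counts = [0] * bins
        for h in hours:
            # Map the time of day onto a bin index, guarding the upper edge.
            index = min(int((h % 24) * bins / 24), bins - 1)
            counts[index] += 1
        return [c / len(hours) for c in counts]

    pa, pb = proportions(hours_a), proportions(hours_b)
    # Shared activity: the smaller of the two proportions in every bin.
    return sum(min(a, b) for a, b in zip(pa, pb))

# Hypothetical detection times for illustration only.
nocturnal = [22, 23, 0, 1]
diurnal = [10, 11, 12, 13]
same = overlap_coefficient(nocturnal, nocturnal)
disjoint = overlap_coefficient(nocturnal, diurnal)
partial = overlap_coefficient([0, 1], [1, 2])
```

A value near 1 means the two species are active at the same times of day, while a value near 0 means their activity periods barely coincide.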
|
3 |
Modelling temporal aspects of healthcare processes with Ontologies. Afzal, Muhammad. January 2010
This thesis presents an ontological model of the time aspects of a healthcare organization. It describes activities that take place at different intervals of time at Ryhov Hospital. These activities are series of actions that may happen in a predefined sequence and at predefined times, or at any time, in a general ward or an emergency ward of Ryhov Hospital.
To achieve this objective, our supervisor conducted a workshop at the start of the thesis, in which domain experts explained the main ideas of ward activities. From this workshop, the author gained considerable knowledge about activities and time aspects. The author then carried out a literature review on ward activities, time aspects, and the methodology steps essential for building an ontological model. After the ontological model for time aspects was developed, our supervisor conducted a second workshop, in which the author presented the model for evaluation.
|
5 |
Adding temporal plasticity to a self-organizing incremental neural network using temporal activity diffusion / Om att utöka ett självorganiserande inkrementellt neuralt nätverk med temporal plasticitet genom temporal aktivitetsdiffusion. Lundberg, Emil. January 2015
Vector Quantization (VQ) is a classic optimization problem and a simple approach to pattern recognition. Applications include lossy data compression, clustering, and speech and speaker recognition. Although VQ has largely been replaced by time-aware techniques like Hidden Markov Models (HMMs) and Dynamic Time Warping (DTW) in some applications, such as speech and speaker recognition, VQ still retains some significance due to its much lower computational cost, especially for embedded systems. A recent study also demonstrates a multi-section VQ system that achieves performance rivaling that of DTW in an application to handwritten signature recognition, at a much lower computational cost. Adding sensitivity to temporal patterns to a VQ algorithm could help improve such results further. SOTPAR2 is such an extension of Neural Gas, an Artificial Neural Network algorithm for VQ. SOTPAR2 uses a conceptually simple approach, based on adding lateral connections between network nodes and creating “temporal activity” that diffuses through adjacent nodes. The activity in turn biases the nearest-neighbor classifier toward network nodes with high activity, and the SOTPAR2 authors report improvements over Neural Gas in an application to time series prediction. This report presents an investigation of how this same extension affects the quantization and prediction performance of the self-organizing incremental neural network (SOINN) algorithm. SOINN is a VQ algorithm that automatically chooses a suitable codebook size and can also be used for clustering with arbitrary cluster shapes. The extension is found not to improve the performance of SOINN; in fact, it makes performance worse in all experiments attempted. A discussion of this result is provided, along with a discussion of the impact of the algorithm parameters, and possible future work to improve the results is suggested.
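The idea of biasing a nearest-neighbor quantizer toward recently active nodes can be sketched as follows. This is an illustrative simplification under stated assumptions, not the SOTPAR2 or SOINN update rule: the function names `quantize` and `diffuse`, the parameters `bias`, `decay`, and `spread`, and the three-node codebook are all hypothetical.

```python
import math

def quantize(x, codebook, activity=None, bias=0.0):
    """Index of the winning codebook vector for input x.

    Without activity this is plain nearest-neighbor VQ. With activity, each
    node's distance is reduced by bias * activity (an assumed form of the bias).
    """
    best, best_cost = None, math.inf
    for i, c in enumerate(codebook):
        cost = math.dist(x, c)
        if activity is not None:
            cost -= bias * activity[i]
        if cost < best_cost:
            best, best_cost = i, cost
    return best

def diffuse(activity, winner, edges, decay=0.5, spread=0.5):
    """Decay all activity, excite the winner, and pass a share to its lateral neighbors."""
    new = [a * decay for a in activity]
    new[winner] += 1.0
    for j in edges.get(winner, ()):
        new[j] += spread
    return new

# Hypothetical three-node network laid out on a line, with lateral connections.
codebook = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]]
edges = {0: [1], 1: [0, 2], 2: [1]}

plain_winner = quantize([1.4, 0.0], codebook)                     # nearest node by distance alone
activity = diffuse([0.0, 0.0, 0.0], winner=2, edges=edges)        # node 2 won the previous step
biased_winner = quantize([1.4, 0.0], codebook, activity, bias=0.5)
```

In this toy run the same input is assigned to a different node once the previous winner's activity is taken into account, which is the kind of temporal sensitivity the extension is meant to add.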
|