151

CLIP-RS: A Cross-modal Remote Sensing Image Retrieval Based on CLIP, a Northern Virginia Case Study

Djoufack Basso, Larissa 21 June 2022 (has links)
Satellite imagery research used to be an expensive research topic for companies and organizations due to limited data and compute resources. As computing power and storage capacity grow exponentially, large amounts of aerial and satellite images are generated and analyzed every day for various applications. Current technological advancement and extensive data collection by numerous Internet of Things (IoT) devices and platforms have greatly expanded the supply of labeled natural images. Such data availability catalyzed the development and performance of current state-of-the-art image classification and cross-modal models. Despite the abundance of publicly available remote sensing images, very few remote sensing (RS) images are labeled and even fewer are multi-captioned. These scarcities limit the scope of fine-tuned state-of-the-art models to at most 38 classes, based on PatternNet, one of the largest publicly available labeled RS datasets. Recent state-of-the-art image-to-image retrieval and detection models in RS have shown great results. Because text-to-image retrieval of RS images is still emerging, it still faces challenges: inaccurate retrieval of image categories that were not present in the training dataset and the retrieval of images from descriptive input. Motivated by those shortcomings in current cross-modal remote sensing image retrieval, we proposed CLIP-RS, a cross-modal remote sensing image retrieval platform. Our proposed framework CLIP-RS combines a fine-tuned implementation of a recent state-of-the-art cross-modal, text-based image retrieval model, Contrastive Language-Image Pre-training (CLIP), with FAISS (Facebook AI Similarity Search), a library for efficient similarity search. Our implementation is deployed on a web app for inference on text-to-image and image-to-image retrieval of RS images collected via the Mapbox GL JS API. We used the free tier of the Mapbox GL JS API and took advantage of its raster-tile option to locate the retrieved results on a local map assembled from the downloaded raster tiles. Other options offered on our platform are image similarity search, locating an image on the map, and viewing images' geocoordinates and addresses. In this work we also proposed two remote sensing fine-tuned models and conducted a comparative analysis of our proposed models with a different fine-tuned model as well as the zero-shot CLIP model on remote sensing data.
/ Master of Science / Satellite imagery research used to be an expensive research topic for companies and organizations due to limited data and compute resources. As computing power and storage capacity grow exponentially, large amounts of aerial and satellite images are generated and analyzed every day for various applications. Current technological advancement and extensive data collection by numerous Internet of Things (IoT) devices and platforms have greatly expanded the supply of labeled natural images. Such data availability catalyzed the development and performance of current state-of-the-art image classification and cross-modal models. Despite the abundance of publicly available remote sensing images, very few remote sensing (RS) images are labeled and even fewer are multi-captioned. These scarcities limit the scope of fine-tuned state-of-the-art models to at most 38 classes, based on PatternNet, one of the largest publicly available labeled RS datasets. Recent state-of-the-art image-to-image retrieval and detection models in RS have shown great results. Because text-to-image retrieval of RS images is still emerging, it still faces challenges: inaccurate retrieval of image categories that were not present in the training dataset and the retrieval of images from descriptive input. Motivated by those shortcomings in current cross-modal remote sensing image retrieval, we proposed CLIP-RS, a cross-modal remote sensing image retrieval platform. Cross-modal retrieval focuses on data retrieval across different modalities; in the context of this work, we focus on the textual and imagery modalities. Our proposed framework CLIP-RS combines a fine-tuned implementation of a recent state-of-the-art cross-modal, text-based image retrieval model, Contrastive Language-Image Pre-training (CLIP), with FAISS (Facebook AI Similarity Search), a library for efficient similarity search. In deep learning, fine-tuning consists of reusing the weights of a model trained on one task in a similar model applied to a different, domain-specific task. Our implementation is deployed on a web application for inference on text-to-image and image-to-image retrieval of RS images collected via the Mapbox GL JS API. We used the free tier of the Mapbox GL JS API and took advantage of its raster-tile option to locate the retrieved results on a local map assembled from the downloaded raster tiles. Other options offered on our platform are image similarity search, locating an image on the map, and viewing images' geocoordinates and addresses. In this work we also proposed two remote sensing fine-tuned models and conducted a comparative analysis of our proposed models with a different fine-tuned model as well as the zero-shot CLIP model on remote sensing data.
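As a rough illustration of the retrieval pipeline described in this abstract, the sketch below wires a CLIP encoder to a FAISS inner-product index for text-to-image search. It uses the public openai/clip-vit-base-patch32 checkpoint and placeholder tile filenames, not the fine-tuned CLIP-RS models or the Mapbox-collected imagery from the thesis.

```python
# Sketch of a CLIP + FAISS text-to-image retrieval loop. The checkpoint and
# image paths are placeholders, not the thesis's fine-tuned models or data.
import faiss
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image_paths = ["tile_001.png", "tile_002.png"]  # placeholder raster tiles

# Encode the image collection once and index it for inner-product search.
with torch.no_grad():
    pixels = processor(images=[Image.open(p) for p in image_paths], return_tensors="pt")
    img_emb = model.get_image_features(**pixels).numpy().astype("float32")
faiss.normalize_L2(img_emb)                  # cosine similarity via inner product
index = faiss.IndexFlatIP(img_emb.shape[1])
index.add(img_emb)

# Text query -> CLIP text embedding -> nearest images in the index.
with torch.no_grad():
    tokens = processor(text=["an intersection of two highways"], return_tensors="pt", padding=True)
    txt_emb = model.get_text_features(**tokens).numpy().astype("float32")
faiss.normalize_L2(txt_emb)
scores, ids = index.search(txt_emb, 2)
print([image_paths[i] for i in ids[0]], scores[0])
```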
152

Self-supervised Representation Learning in Computer Vision and Reinforcement Learning

Ermolov, Aleksandr 06 December 2022 (has links)
This work is devoted to self-supervised representation learning (SSL). We consider both contrastive and non-contrastive methods and present a new loss function for SSL based on feature whitening. Our solution is conceptually simple and competitive with other methods. Self-supervised representations are beneficial for most areas of deep learning, and reinforcement learning is of particular interest because SSL can compensate for the sparsity of the training signal. We present two methods from this area. The first tackles partial observability by providing the agent with a history, represented using temporal alignment, and improves performance in most Atari environments. The second addresses the exploration problem. The method employs a world model of the SSL latent space, and the prediction error of this model indicates novel states that require exploration. It shows strong performance on exploration-hard benchmarks, especially on the notorious Montezuma's Revenge. Finally, we consider the metric learning problem, which has much in common with SSL approaches. We present a new method based on hyperbolic embeddings, vision transformers and contrastive loss. We demonstrate the advantage of hyperbolic space over the widely used Euclidean space for metric learning. The method outperforms the current state-of-the-art by a significant margin.
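To make the feature-whitening idea concrete, the following PyTorch sketch shows a whitening-style SSL loss (in the spirit of W-MSE-type approaches): each view's batch of embeddings is whitened and normalized, and positive pairs are pulled together. This is an illustrative simplification, not necessarily the exact loss used in the thesis.

```python
# Illustrative whitening-based SSL loss (a simplified sketch, not the thesis's
# exact formulation): whiten each view's batch, normalize, pull positives together.
import torch
import torch.nn.functional as F

def cholesky_whiten(z: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Zero-mean the batch and decorrelate features via Cholesky whitening."""
    z = z - z.mean(dim=0)
    cov = (z.T @ z) / (z.shape[0] - 1) + eps * torch.eye(z.shape[1], device=z.device)
    L = torch.linalg.cholesky(cov)
    # Whitened = z @ L^{-T}, computed by solving the triangular system.
    return torch.linalg.solve_triangular(L, z.T, upper=False).T

def whitening_ssl_loss(z1: torch.Tensor, z2: torch.Tensor) -> torch.Tensor:
    """MSE-like distance between normalized, whitened embeddings of two views."""
    w1 = F.normalize(cholesky_whiten(z1), dim=1)
    w2 = F.normalize(cholesky_whiten(z2), dim=1)
    return (2 - 2 * (w1 * w2).sum(dim=1)).mean()

# Usage: z1, z2 are encoder outputs for two augmentations of the same image batch.
z1, z2 = torch.randn(256, 64), torch.randn(256, 64)
loss = whitening_ssl_loss(z1, z2)
```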
153

An Evolutionary Approximation to Contrastive Divergence in Convolutional Restricted Boltzmann Machines

McCoppin, Ryan R. January 2014 (has links)
No description available.
154

IDENTICAL CONSTITUENT COMPOUNDING: A CONCEPTUAL INTEGRATION-BASED MODEL

Benjamin, Brandon Lee 31 May 2018 (has links)
No description available.
155

A Contrastive Study of the Intercultural Differences in People’s Reactions Based on Their Cultures

Oghanian, Mina January 2016 (has links)
No description available.
156

The Development of Children’s Processing of English Pitch Accents in a Visual Search Task

Bibyk, Sarah Alaine 08 September 2010 (has links)
No description available.
157

Contrastive Filtering And Dual-Objective Supervised Learning For Novel Class Discovery In Document-Level Relation Extraction

Hansen, Nicholas 01 June 2024 (has links) (PDF)
Relation extraction (RE) is a task within natural language processing focused on the classification of relationships between entities in a given text. Primary applications of RE can be seen in various contexts such as knowledge graph construction and question answering systems. Traditional approaches to RE tend towards the prediction of relationships between exactly two entity mentions in small text snippets. However, with the introduction of datasets such as DocRED, research in this niche has progressed to examining RE at the document level. Document-level relation extraction (DocRE) disrupts conventional approaches, as it inherently introduces the possibility of multiple mentions of each unique entity throughout the document along with a significantly higher probability of multiple relationships between entity pairs. There have been many effective approaches to document-level RE in recent years utilizing various architectures, such as transformers and graph neural networks. However, all of these approaches focus on the classification of a fixed number of known relationships. Because of the large quantity of possible unique relationships in a given corpus, it is unlikely that all interesting and valuable relationship types are labeled beforehand. Furthermore, traditional naive approaches to clustering on unlabeled data to discover novel classes are not effective due to the large presence of true negatives. Therefore, in this work we propose a multi-step filter-and-train approach leveraging contrastive representation learning to discover novel relationships at the document level. Additionally, we propose the use of an alternative pretrained encoder in an existing DocRE solution architecture to improve F1 performance in base multi-label classification on the DocRED dataset by 0.46. To the best of our knowledge, this is the first exploration of novel class discovery applied to the document-level RE task. Based upon our holdout evaluation method, we increase novel class instance representation in the clustering solution by 5.5 times compared to the naive approach and increase the purity of novel class clusters by nearly 4 times. We then further enable the retrieval of both novel and known classes at test time, provided human labeling of cluster propositions, achieving a macro F1 score of 0.292 for novel classes. Finally, we note only a slight macro F1 decrease on previously known classes, from 0.402 with fully supervised training to 0.391 with our novel class discovery training approach.
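As a loose illustration of the filter-then-cluster idea (not the dual-objective training or the exact filtering criterion from this work), one might suppress likely true negatives before clustering the remaining contrastively trained pair embeddings; all names and thresholds below are hypothetical.

```python
# Hypothetical sketch: filter out "no relation"-like pairs, then cluster the rest
# to surface candidate novel relation types. Not the thesis's actual pipeline.
import numpy as np
from sklearn.cluster import KMeans

def discover_novel_classes(embeddings, known_protos, na_proto, keep_quantile=0.5, n_novel=5):
    """embeddings: (N, d) L2-normalized pair representations.
    known_protos: (K, d) mean embeddings of labeled relation classes.
    na_proto: (d,) mean embedding of 'no relation' pairs."""
    # Score each instance by how much closer it is to any known relation than to NA.
    sim_known = embeddings @ known_protos.T          # (N, K) cosine similarities
    sim_na = embeddings @ na_proto                   # (N,)
    score = sim_known.max(axis=1) - sim_na
    # Keep only the most relation-like instances to suppress true negatives.
    keep = score >= np.quantile(score, keep_quantile)
    candidates = embeddings[keep]
    # Cluster the filtered pool; clusters are candidate novel relation types.
    labels = KMeans(n_clusters=n_novel, n_init=10, random_state=0).fit_predict(candidates)
    return keep, labels
```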
158

Análisis comparativo de las estrategias metadiscursivas en el género del debate electoral en España y Estados Unidos

Albalat Mascarell, Ana 17 May 2021 (has links)
[ES] Entendida como la disciplina que aborda el estudio de los usos comunicativos del lenguaje, la pragmática ha sido uno de los centros de la investigación en lingüística de las últimas décadas (Escandell Vidal, 2013). En línea con esta orientación teórica eminentemente funcional que otorga preeminencia tanto a la interacción emisor-receptor como al contexto de enunciación, el metadiscurso (Hyland y Tse, 2004; Hyland, 2018) se erige como un paradigma de análisis fiable en la medida en que proporciona un marco conceptual para comprender las diversas estrategias interpersonales que atienden al modo en el que el hablante organiza su discurso y se relaciona con su destinatario, desvelando, por tanto, las prácticas retóricas propias de diferentes comunidades lingüísticas y culturales. La presente propuesta busca examinar estos rasgos y patrones metadiscursivos en el discurso político hablado de las elecciones en España y los Estados Unidos, estableciendo así las bases para una investigación de las estrategias del discurso que acompañan y refuerzan al acto de habla. En este sentido, se pretende avanzar en una doble dirección epistemológica, revelando por un lado las estrategias interpersonales arriba citadas que expresan la función de persuasión en una modalidad concreta, la política, y por otro descubriendo la variabilidad interlingüística e intercultural de estas marcas persuasivas y de adecuación a las exigencias del contexto comunicativo. Todo ello se consigue al estudiar, desde una óptica comparativa, los debates electorales destinados a las comunidades española y estadounidense, aplicando los presupuestos teóricos y metodológicos ya mencionados sobre una gran base documental, es decir, avalada por procedimientos propios de la lingüística de corpus (Baker, 2010). / [CA] Entesa com la disciplina que aborda l'estudi dels usos comunicatius del llenguatge, la pragmàtica ha estat un dels centres de la investigació en lingüística de les últimes dècades (Escandell Vidal, 2013). En línea amb aquesta orientació teòrica eminentment funcional que atorga importància tant a la interacció emissor-receptor com al context d'enunciació, el metadiscurs (Hyland y Tse, 2004; Hyland, 2018), s'erigeix com un paradigma d'anàlisi fiable en la mesura en què proporciona un marc conceptual per comprendre les diverses estratègies interpersonals que atenen la manera en què l'autor organitza el text i es relaciona amb el destinatari, desvetlant, per tant, les practiques retòriques pròpies de diferents comunitats lingüístiques i culturals. La present proposta busca examinar aquests trets i patrons metadiscursius en el discurs polític parlat de les eleccions a Espanya i els Estats Units per tal d'establir les bases per a una investigació de les estratègies del discurs que acompanyen i reforcen l'acte de parla. En aquest sentit, pretenem avançar en una doble direcció epistemològica, examinant d'una banda les estratègies interpersonals adès citades que expressen la funció de persuasió en una modalitat concreta, la política, i d'altra descobrint la variabilitat interlingüística i intercultural d'aquests marcadors persuasius i d'adequació a les exigències del context comunicatiu. Tot això ho aconseguim arran d'estudiar, des d'una òptica comparativa, els debats electorals destinats a les comunitats espanyola i nord-americana, aplicant els pressupostos teòrics i metodològics ja esmentats sobre una gran base documental, és a dir, avalada per procediments propis de la lingüística de corpus (Baker, 2010). 
/ [EN] Characterized as the discipline that focuses on the study of the communicative uses of language, pragmatics has been one of the main areas of linguistics in the last few decades (Escandell Vidal, 2013). In line with this functional orientation that gives prominence to both speaker-hearer interactions and the broad socio-cultural context, the interpersonal metadiscourse model (Hyland and Tse, 2004; Hyland, 2018) emerges as a reliable analytical framework since it provides a way to understand the diverse interpersonal strategies addressing the way in which speakers can organize their own discourse and engage with audiences, thus revealing diverse persuasive resources embedded in particular linguistic and cultural communities. The present proposal seeks to explore these metadiscursive traits and patterns in speeches belonging to the political election campaigns taking place in Spain and the United States, thus establishing the foundations for an investigation of the metadiscourse features that serve and reinforce the speech act. In this sense, it is intended to advance in a two-fold epistemological direction, signalling, on the one hand, the interpersonal strategies that accomplish the role of persuasion in a particular modality, the political one, and, on the other hand, exploring the cross-linguistic and cross-cultural variability of these persuasive markers adapted to the characteristics and demands of a given communicative context. This can be achieved by analyzing, from a comparative perspective, the election campaign debates addressed to the Spanish and North-American communities, adjusting to the theories and analytical methods endorsed by corpus linguistics (Baker, 2010). / Albalat Mascarell, A. (2021). Análisis comparativo de las estrategias metadiscursivas en el género del debate electoral en España y Estados Unidos [Tesis doctoral]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/166438
159

Bridging Machine Learning and Experimental Design for Enhanced Data Analysis and Optimization

Guo, Qing 19 July 2024 (has links)
Experimental design is a powerful tool for gathering highly informative observations using a small number of experiments. The demand for smart data collection strategies is increasing due to the need to save time and budget, especially in online experiments and machine learning. However, traditional experimental design methods fall short in systematically assessing the effects of changing variables. Specifically within Artificial Intelligence (AI), the challenge lies in assessing the impacts of model structures and training strategies on task performance with a limited number of trials. This shortfall underscores the necessity of developing novel approaches. On the other hand, the optimal design criterion in the classic design literature has typically been model-based, which restricts the flexibility of experimental design strategies. However, machine learning's inherent flexibility can enable efficient estimation of metrics using nonparametric and optimization techniques, thereby broadening the horizons of experimental design possibilities. In this dissertation, the aim is to develop a set of novel methods that bridge the merits of these two domains: 1) applying ideas from statistical experimental design to enhance data efficiency in machine learning, and 2) leveraging powerful deep neural networks to optimize experimental design strategies. This dissertation consists of 5 chapters. Chapter 1 provides a general introduction to mutual information, fractional factorial design, hyper-parameter tuning, multi-modality, etc. In Chapter 2, I propose a new mutual information estimator, FLO, by integrating techniques from variational inference, contrastive learning, and convex optimization. I apply FLO to broad data science applications, such as efficient data collection, transfer learning, fair learning, etc. Chapter 3 introduces a new design strategy called multi-layer sliced design (MLSD) with an application to AI assurance. It focuses on exploring the effects of hyper-parameters under different models and optimization strategies. Chapter 4 investigates classic vision challenges via multimodal large language models by implicitly optimizing mutual information and thoroughly exploring training strategies. Chapter 5 concludes this dissertation and discusses several future research topics. / Doctor of Philosophy / In the digital age, artificial intelligence (AI) is reshaping our interactions with technology through advanced machine learning models. These models are complex, often opaque mechanisms that present challenges in understanding their inner workings. This complexity necessitates numerous experiments with different settings to optimize performance, which can be costly. Consequently, it is crucial to strategically evaluate the effects of various strategies on task performance using a limited number of trials. The Design of Experiments (DoE) offers invaluable techniques for investigating and understanding these complex systems efficiently. Moreover, integrating machine learning models can further enhance DoE. Traditionally, experimental designs pre-specify a model and focus on finding the best strategies for experimentation. This assumption can restrict the adaptability and applicability of experimental designs. However, the inherent flexibility of machine learning models can enhance the capabilities of DoE, unlocking new possibilities for efficiently optimizing experimental strategies through an information-centric approach.
Moreover, the information-based method can also be beneficial in other AI applications, including self-supervised learning, fair learning, transfer learning, etc. The research presented in this dissertation aims to bridge machine learning and experimental design, offering new insights and methodologies that benefit both AI techniques and DoE.
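For context, the sketch below shows the standard contrastive (InfoNCE) lower bound on mutual information that contrastive estimators build on; it is a generic illustration, not the FLO estimator proposed in the dissertation.

```python
# Generic contrastive (InfoNCE) lower bound on mutual information, for context.
# This is not the FLO estimator from the dissertation.
import math
import torch
import torch.nn.functional as F

def infonce_mi_lower_bound(fx: torch.Tensor, fy: torch.Tensor) -> torch.Tensor:
    """fx, fy: (N, d) embeddings of paired samples (x_i, y_i) from the joint;
    mismatched pairs (i != j) act as negatives from the product of marginals."""
    scores = fx @ fy.T                      # (N, N) critic values f(x_i, y_j)
    labels = torch.arange(scores.shape[0])
    # I(X; Y) >= log N - mean cross-entropy of identifying the matching pair.
    return math.log(scores.shape[0]) - F.cross_entropy(scores, labels)

# Usage with random features just to show the call signature.
fx, fy = torch.randn(128, 32), torch.randn(128, 32)
print(infonce_mi_lower_bound(fx, fy))
```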
160

Facial Motion Augmented Identity Verification with Deep Neural Networks

Sun, Zheng 06 October 2023 (has links) (PDF)
Identity verification is ubiquitous in our daily life. By verifying the user's identity, the authorization process grants the privilege to access resources or facilities or perform certain tasks. The traditional and most prevalent authentication method is the personal identification number (PIN) or password. While these knowledge-based credentials could be lost or stolen, human biometric-based verification technologies have become popular alternatives in recent years. Nowadays, more people are used to unlocking their smartphones using their fingerprint or face instead of the conventional passcode. However, these biometric approaches have their weaknesses. For example, fingerprints can be easily fabricated, and a photo or image can spoof a face recognition system. In addition, these existing biometric-based identity verification methods can proceed even if the user is unaware, sleeping, or unconscious. Therefore, an additional level of security is needed. In this dissertation, we demonstrate a novel identity verification approach that makes the biometric authentication process more secure. Our approach requires only one regular camera to acquire a short video for computing the face and facial motion representations. It takes advantage of advancements in computer vision and deep learning techniques. Our new deep neural network model, or facial motion encoder, can generate a representation vector for the facial motion in the video. The decision algorithm then compares this vector to the enrolled facial motion vector to determine their similarity for identity verification. We first proved its feasibility through a keypoint-based method. After that, we built a curated dataset and proposed a novel representation learning framework for facial motions. The experimental results show that this facial motion verification approach reaches an average precision of 98.8%, which is more than adequate for everyday use. We also tested this algorithm on complex facial motions and proposed a new self-supervised pretraining approach to boost the encoder's performance. Finally, we evaluated two other potential upstream tasks that could help improve the efficiency of facial motion encoding. Through these efforts, we have built a solid benchmark for facial motion representation learning, and the elaborate techniques can inspire other face analysis and video understanding research.
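To illustrate the final verification step described above, the sketch below compares a probe facial-motion embedding against the enrolled one; cosine similarity and the threshold value are illustrative assumptions, not the metric or operating point tuned in the dissertation.

```python
# Illustrative verification decision: cosine similarity between the enrolled
# and probe facial-motion embeddings. The threshold here is a made-up example.
import numpy as np

def verify(enrolled_vec: np.ndarray, probe_vec: np.ndarray, threshold: float = 0.8):
    """Accept the probe if its embedding is close enough to the enrolled one.
    Both vectors are assumed to come from the facial motion encoder."""
    a = enrolled_vec / np.linalg.norm(enrolled_vec)
    b = probe_vec / np.linalg.norm(probe_vec)
    cosine_similarity = float(a @ b)
    return cosine_similarity >= threshold, cosine_similarity

# Usage: accept, score = verify(enrolled_embedding, probe_embedding)
```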
