211 |
[en] ENRICHING AND ANALYZING SEMANTIC TRAJECTORIES WITH LINKED OPEN DATA / [pt] ENRIQUECENDO E ANALISANDO TRAJETÓRIAS SEMÂNTICAS COM DADOS ABERTOS INTERLIGADOS
LIVIA COUTO RUBACK RODRIGUES 26 February 2018
Recent years have witnessed a growing number of devices that track moving objects: personal GPS-equipped devices and GSM mobile phones, vehicles and other Internet of Things sensors, as well as location data derived from social network check-ins. These mobility data are represented as trajectories, which record the sequence of locations of a moving object. However, these sequences represent only the raw location data; they need to be semantically enriched to be meaningful in analysis tasks and to support a deep understanding of movement behavior. Another unprecedented global data space that is growing at a fast pace is the Web of Data, thanks to the emergence of the Linked Data initiative. These freely available, semantically rich datasets provide a novel way to enhance trajectory data. This thesis contributes to the challenges that arise from this scenario. First, it investigates how trajectory data may benefit from the Linked Data initiative by guiding the whole trajectory enrichment process with the use of external datasets. Second, it addresses the pivotal topic of similarity computation between Linked Data entities, with the final objective of computing the similarity between semantically enriched trajectories. The novelty of the approach is that it treats the relevant entity features as ranked lists. Finally, the thesis targets the computation of the similarity between enriched trajectories by comparing the similarity of all the Linked Data entities that represent them.
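The abstract does not say which ranked-list measure the thesis uses, so the following is only a minimal sketch of the general idea, assuming "average overlap" as the rank-aware measure and a symmetric best-match average to lift entity similarity to trajectories. The function names and the feature strings in the usage note are hypothetical.

```python
def average_overlap(a, b, depth=None):
    """Rank-aware similarity of two ranked feature lists.

    Averages, over every prefix length d = 1..depth, the share of items
    the two top-d prefixes have in common. Agreement near the top of
    the lists therefore counts more than agreement near the bottom.
    """
    if depth is None:
        depth = max(len(a), len(b))
    if depth == 0:
        return 0.0
    total = 0.0
    for d in range(1, depth + 1):
        total += len(set(a[:d]) & set(b[:d])) / d
    return total / depth


def trajectory_similarity(t1, t2):
    """Similarity of two enriched trajectories (assumed aggregation).

    Each trajectory is a list of entities, and each entity is a ranked
    list of features. Every entity is matched to its most similar
    counterpart in the other trajectory; the two directed averages are
    then averaged to keep the measure symmetric.
    """
    if not t1 or not t2:
        return 0.0

    def directed(x, y):
        return sum(max(average_overlap(e, f) for f in y) for e in x) / len(x)

    return (directed(t1, t2) + directed(t2, t1)) / 2
```

For instance, `average_overlap(["museum", "art"], ["art", "museum"])` gives 0.5: the two lists share the same features, but disagree on which one ranks first.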
|
212 |
A Study On the Mutual Replacements of Three des in Chinese Blogs
Sha, Hui 07 November 2016
The three des, structural particles in Chinese, are phonologically identical but written differently. By analyzing the written forms of these three homophonous particles, this research reaches conclusions that cannot be obtained by observing spoken Chinese alone. This paper studies the relationships among the three des (de1 as “的”; de2 as “地”; de3 as “得”), which function as structural particles in written Chinese, by examining their mutual replacements in blogs. The research regards every living language as a Complex Adaptive System (CAS) that changes continuously. This perspective not only helps us understand the structures containing the three des more deeply, but also opens a new window for exploring variation in Chinese at the cognitive linguistic level, including its syntactic and semantic aspects. Analysis of authentic data, drawn from a corpus of personal blog articles totaling 400,000 Chinese characters, yields several findings. First, the mutual replacements are asymmetric, in step with the generalization of de1. Second, the frequency of replacements correlates positively with the linguistic relatedness among the three des, especially in syntactic and semantic aspects. Finally, the replacements among the three des present a diverse and complicated picture when the written forms of idiolects are investigated. The syntactic factor plays the main role in the replacements among the three des. The degree of relatedness between de1 and de2 is significantly higher than that between de1 and de3, which is especially evident among writers with relatively frequent replacements.
|
213 |
Musical training and semantic integration in sentence processing: Tales of the unexpected
Featherstone, C.R., Morrison, Catriona M., Waterman, M.G., MacGregor, L.J. January 2014
Building on models of transfer effects between musical training and language processing, and on evidence of similarities in the way the brain responds to unexpected elements in music and language, we investigated whether effects of musical training could be observed at the level of sentence processing. Using sentences that tax the semantic processes involved in natural comprehension and avoid outright anomalies, we showed a striking difference between musicians and non-musicians: unlike non-musicians, musicians showed no N400 response to novel metaphorical words which were more difficult to integrate semantically into their context than literal controls. This difference between musicians and non-musicians in semantic processing in sentences shows an effect of musicianship at the highest level of music–language transfer effects demonstrated so far in the literature. As well as adding to the growing body of evidence surrounding the relationship between musical training and language processing, this work provides support for theories which suggest shared resources, computations, and neural areas underpinning the high-level processing of music and language.
|
214 |
ImageSI: Interactive Deep Learning for Image Semantic Interaction
Lin, Jiayue 04 June 2024
Interactive deep learning frameworks are crucial for effectively exploring and analyzing complex image datasets in visual analytics. However, existing approaches often face challenges related to inference accuracy and adaptability. To address these issues, we propose ImageSI, a framework integrating deep learning models with semantic interaction techniques for interactive image data analysis. Unlike traditional methods, ImageSI directly incorporates user feedback into the image model, updating underlying embeddings through customized loss functions, thereby enhancing the performance of dimension reduction tasks. We introduce three variations of ImageSI: ImageSI$_{\text{MDS}^{-1}}$, which prioritizes explicit pairwise relationships from user interaction, and ImageSI$_{\text{DRTriplet}}$ and ImageSI$_{\text{PHTriplet}}$, which emphasize clustering by defining groups of images based on user input. Through usage scenarios and quantitative analyses centered on algorithms, we demonstrate the superior performance of ImageSI$_{\text{DRTriplet}}$ and ImageSI$_{\text{MDS}^{-1}}$ in terms of inference accuracy and interaction efficiency. Moreover, ImageSI$_{\text{PHTriplet}}$ shows competitive results. The baseline model, WMDS$^{-1}$, generally exhibits lower performance metrics. / Master of Science
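The abstract names triplet-based variants but does not give their loss functions. As a hedged illustration only, the sketch below shows the standard triplet margin loss on plain Python vectors; how ImageSI actually builds triplets from user-defined groups, and what margin it uses, are assumptions here.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss (not necessarily ImageSI's exact loss).

    The loss is zero once the anchor is closer to the positive (an image
    the user put in the same group) than to the negative (an image from
    another group) by at least `margin`; otherwise it grows linearly
    with the size of the violation.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)
```

Minimizing this quantity over many triplets pulls same-group embeddings together and pushes different-group embeddings apart, which is the clustering behavior the abstract attributes to the triplet variants.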
|
215 |
Semantic Web Enabled Composition of Web Services
Medjahed, Brahim 30 April 2004
In this dissertation, we present a novel approach for the automatic composition of Web services on the envisioned Semantic Web. Automatic service composition requires dealing with three major research thrusts: semantic description of Web services, composability of participant services, and generation of composite service descriptions.
This dissertation deals with the aforementioned research issues. We first propose an ontology-based framework for organizing and describing semantic Web services. We introduce the concept of community to cluster Web services based on their domain of interest. Each community is defined as an instance of an ontology called community ontology. We then propose a composability model to check whether semantic Web services can be combined together, hence avoiding unexpected failures at run time. The model defines formal safeguards for meaningful composition through the use of composability rules. We also introduce the notions of composability degree and tau-composability to cater for partial and total composability. Based on the composability model, we propose a set of algorithms that automatically generate detailed descriptions of composite services from high-level specifications of composition requests. We introduce a Quality of Composition (QoC) model to assess the quality of the generated composite services. The techniques presented in this dissertation are implemented in WebDG, a prototype for accessing e-government Web services. Finally, we conduct an extensive performance study (analytical and experimental) of the proposed composition algorithms. / Ph. D.
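The abstract introduces composability degree and tau-composability without defining them. The sketch below is one plausible reading, stated purely as an assumption: the degree is the fraction of composability rules that hold for a pair of services, and the pair is tau-composable when that fraction reaches a threshold tau. The rule names in the usage note are hypothetical.

```python
def composability_degree(rule_results):
    """Fraction of composability rules satisfied (assumed definition).

    `rule_results` maps a rule name to True or False for one candidate
    pair of services.
    """
    if not rule_results:
        return 0.0
    return sum(1 for ok in rule_results.values() if ok) / len(rule_results)


def is_tau_composable(rule_results, tau):
    """True when the composability degree reaches the threshold tau.

    Under this reading, tau = 1.0 demands total composability, while a
    smaller tau accepts partial composability.
    """
    return composability_degree(rule_results) >= tau
```

With hypothetical checks such as `{"binding": True, "message": True, "operation": False, "quality": True}`, the degree is 0.75, so the pair would be accepted at tau = 0.5 and rejected at tau = 1.0.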
|
216 |
Narrative Maps: A Computational Model to Support Analysts in Narrative Sensemaking
Keith Norambuena, Brian Felipe 08 August 2023
Narratives are fundamental to our understanding of the world, and they are pervasive in all activities that involve representing events in time. Narrative analysis has a series of applications in computational journalism, intelligence analysis, and misinformation modeling. In particular, narratives are a key element of the sensemaking process of analysts.
In this work, we propose a narrative model and visualization method to aid analysts with this process. In particular, we propose the narrative maps framework—an event-based representation that uses a directed acyclic graph to represent the narrative structure—and a series of empirically defined design guidelines for map construction obtained from a user study.
Furthermore, our narrative extraction pipeline is based on maximizing coherence—modeled as a function of surface text similarity and topical similarity—subject to coverage—modeled through topical clusters—and structural constraints through the use of linear programming optimization. For the purposes of our evaluation, we focus on the news narrative domain and showcase the capabilities of our model through several case studies and user evaluations.
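The abstract says only that coherence is a function of surface text similarity and topical similarity. As a rough sketch under that description, the code below assumes cosine similarity for both components and a geometric mean to combine them; the vector representations and the combination rule are assumptions, and the linear programming step is not shown.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def coherence(text_vec_a, text_vec_b, topic_dist_a, topic_dist_b):
    """Coherence between two events (assumed form).

    Combines surface similarity (cosine of text vectors) and topical
    similarity (cosine of topic-membership vectors) with a geometric
    mean, so an event pair scores high only when both components agree.
    """
    surface = max(0.0, cosine(text_vec_a, text_vec_b))
    topical = max(0.0, cosine(topic_dist_a, topic_dist_b))
    return math.sqrt(surface * topical)
```

In a pipeline like the one described, scores of this kind would serve as edge weights that the optimization tries to maximize, subject to the coverage and structural constraints.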
Moreover, we augment the narrative maps framework with interactive AI techniques—using semantic interaction and explainable AI—to create an interactive narrative model that is capable of learning from user interactions to customize the narrative model based on the user's needs and providing explanations for each core component of the narrative model. Throughout this process, we propose a general framework for interactive AI that can handle similar models to narrative maps—that is, models that mix continuous low-level representations (e.g., dimensionality reduction) with more abstract high-level discrete structures (e.g., graphs).
Finally, we evaluate our proposed framework through an insight-based user study. In particular, we perform a quantitative and qualitative assessment of the behavior of users and explore their cognitive strategies, including how they use the explainable AI and semantic interaction capabilities of our system. Our evaluation shows that our proposed interactive AI framework for narrative maps is capable of aiding users in finding more insights from data when compared to the baseline. / Doctor of Philosophy / Narratives are essential to how we understand the world. They help us make sense of events that happen over time. This research focuses on developing a method to assist people, like journalists and analysts, in understanding complex information.
To do this, we introduce a new approach called narrative maps. This model allows us to extract and visualize stories from text data. To improve our model, we use interactive artificial intelligence techniques. These techniques allow our model to learn from user feedback and be customized to fit different needs. We also use these methods to explain how the model works, so users can understand it better.
We evaluate our approach by studying how users interact with it when doing a task with news stories. We consider how useful the system is in helping users gain insights. Our results show that our method aids users in finding important insights compared to traditional methods.
|
217 |
Explainable Interactive Projections for Image Data
Han, Huimin 12 January 2023
Making sense of large collections of images is difficult. Dimension reduction (DR) techniques assist by organizing images in a 2D space based on similarities, but provide little support for explaining why images were placed together or apart in the 2D space. Additionally, they do not provide support for modifying and updating the 2D space to explore new relationships and organizations of images. To address these problems, we present an interactive DR method for images that uses visual features extracted by a deep neural network to project the images into 2D space and provides visual explanations of image features that contributed to the 2D location. In addition, it allows people to directly manipulate the 2D projection space to define alternative relationships and explore subsequent projections of the images. With an iterative cycle of semantic interaction and explainable-AI feedback, people can explore complex visual relationships in image data. Our approach to human-AI interaction integrates visual knowledge from both human mental models and pre-trained deep neural models to explore image data. Two usage scenarios are provided to demonstrate that our method is able to capture human feedback and incorporate it into the model. Our visual explanations help bridge the gap between the feature space and the original images to illustrate the knowledge learned by the model, creating a synergy between human and machine that facilitates a more complete analysis experience. / Master of Science / High-dimensional data is everywhere: spreadsheets with many columns, text documents, images, and so on. Exploring and visualizing high-dimensional data can be challenging. Dimension reduction (DR) techniques can help: high-dimensional data can be projected into 3D or 2D space and visualized as a scatter plot. Additionally, a DR tool can be interactive to help users better explore the data and understand the underlying algorithms. Designing such an interactive DR tool is challenging for images.
To address this problem, this thesis presents a tool that visualizes images in a 2D plot, where data points considered similar are projected close to each other and dissimilar ones far apart. Users can manipulate images directly on this scatterplot-like visualization, based on their own knowledge, to update the display. Saliency maps are provided to reflect the model's re-projection reasoning.
|
218 |
A Semantic Web-Based Digital Library Infrastructure to Facilitate Computational Epidemiology
Hasan, S. M. Shamimul 15 September 2017
Computational epidemiology generates and utilizes massive amounts of data. There are two primary categories of datasets: reported and synthetic. Reported data include epidemic data published by organizations (e.g., WHO, CDC, other national ministries and departments of health) during and following actual outbreaks, while synthetic datasets are comprised of spatially explicit synthetic populations, labeled social contact networks, multi-cell statistical experiments, and output data generated from the execution of computer simulation experiments. The discipline of computational epidemiology encounters numerous challenges because of the size, volume, and dynamic nature of both types of these datasets.
In this dissertation, we present semantic web-based schemas to organize diverse reported and synthetic computational epidemiology datasets. There are three layers of these schemas: conceptual, logical, and physical. The conceptual layer provides data abstraction by exposing common entities and properties to the end user. The logical layer captures data fragmentation and linking aspects of the datasets. The physical layer covers storage aspects of the datasets. We can create mapping files from the schemas. The schemas are flexible and can grow.
The schemas presented include data linking approaches that can connect large-scale and widely varying epidemic datasets. This linked data leads to an integrated knowledge-base, enabling an epidemiologist to ask complex queries that employ multiple datasets. We demonstrate the utility of our knowledge-base by developing a query bank, which represents typical analyses carried out by an epidemiologist during the course of planning for or responding to an epidemic. By running queries with different data mapping techniques, we demonstrate the performance of various tools. The empirical results show that leveraging semantic web technology is an effective strategy for: reasoning over multiple datasets simultaneously, developing network queries pertinent in an epidemic analysis, and conducting realistic studies undertaken in an epidemic investigation. The performance of queries varies according to the choice of hardware, underlying database, and resource description framework (RDF) engine. We provide application programming interfaces (APIs) on top of our linked datasets, which an epidemiologist can use for information retrieval, without knowing much about underlying datasets. The proposed semantic web-based digital library infrastructure can be highly beneficial for epidemiologists as they work to comprehend disease propagation for timely outbreak detection and efficient disease control activities. / PHD / Computational epidemiology generates and utilizes massive amounts of data, and the field faces numerous challenges because of the volume and dynamic nature of the datasets utilized. There are two primary categories of datasets. The first contains epidemic datasets tracking actual outbreaks of disease, which are reported by governments, private companies, and associated parties. The second category is synthetic data created through computer simulation. We present semantic web-based schemas to organize diverse reported and synthetic computational epidemiology datasets. 
The schemas are flexible in use and scale, and utilize data linking approaches that can connect large-scale and widely varying epidemic datasets. This linked data leads to an integrated knowledge-base, enabling an epidemiologist to ask complex queries that employ multiple datasets. This ability helps epidemiologists better understand disease propagation, for efficient outbreak detection and disease control activities.
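To make the idea of a query spanning linked reported and synthetic datasets concrete, here is a minimal sketch that uses plain Python tuples as subject-predicate-object triples instead of a real RDF engine. The predicate names, region identifiers, and numbers are all hypothetical toy values, not data from the dissertation.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def cases_per_thousand(triples):
    """Join reported case counts with synthetic population sizes.

    The shared region identifier is the link between the two datasets,
    playing the role a shared URI plays in linked data.
    """
    rates = {}
    for region, _, cases in match(triples, p="reportedCases"):
        for _, _, population in match(triples, s=region, p="syntheticPopulation"):
            rates[region] = cases * 1000 / population
    return rates


# Hypothetical toy triples standing in for two separately published datasets.
reported = [("region:A", "reportedCases", 120), ("region:B", "reportedCases", 15)]
synthetic = [("region:A", "syntheticPopulation", 50000),
             ("region:B", "syntheticPopulation", 8000)]
rates = cases_per_thousand(reported + synthetic)
```

In a real deployment the same join would more likely be written as a SPARQL query against an RDF store; the tuple version only shows why a shared identifier lets one question draw on both datasets.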
|
219 |
Comparing Latent Dirichlet Allocation and Latent Semantic Analysis as Classifiers
Anaya, Leticia H. 12 1900
In the Information Age, a proliferation of unstructured electronic text documents exists. Processing these documents is a daunting task for humans, who have limited cognitive capacity for handling large volumes of documents that can often be extremely lengthy. To address this problem, computer algorithms for text data are being developed. Latent Semantic Analysis (LSA) and Latent Dirichlet Allocation (LDA) are two such algorithms that have each received much attention in the text data literature for topic extraction, but not for document classification or for comparison studies. Since classification is considered an important human function and has been studied in cognitive science and information science, this dissertation presents a research study comparing LDA, LSA, and humans as document classifiers. The research questions posed in this study are: R1: How accurate are LDA and LSA in classifying documents in a corpus of textual data over a known set of topics? R2: How accurate are humans in performing the same classification task? R3: How does LDA classification performance compare to LSA classification performance? To address these questions, a classification study involving human subjects was designed in which humans were asked to generate and classify documents (customer comments) at two levels of abstraction for a quality assurance setting. Then the two computer algorithms, LSA and LDA, were used to classify these documents. The results indicate that humans outperformed both computer algorithms, with an accuracy rate of 94% at the higher level of abstraction and 76% at the lower level. At the higher level of abstraction, the accuracy rates were 84% for both LSA and LDA; at the lower level, the accuracy rates were 67% for LSA and 64% for LDA. The findings of this research have many strong implications for the improvement of information systems that process unstructured text.
Document classifiers have many potential applications in many fields (e.g., fraud detection, information retrieval, national security, and customer management). Development and refinement of algorithms that classify text is a fruitful area of ongoing research and this dissertation contributes to this area.
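The accuracy rates reported above are shares of correctly classified documents. A minimal sketch of that computation follows, assuming each classifier (human, LSA, or LDA) outputs one topic label per document; the labels in the usage note are hypothetical and are not the study's data.

```python
def accuracy(predicted, actual):
    """Share of documents whose predicted topic equals the known topic."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        return 0.0
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)
```

For example, `accuracy(["billing", "shipping", "billing", "quality"], ["billing", "shipping", "quality", "quality"])` returns 0.75, since three of the four hypothetical customer comments receive the correct topic.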
|
220 |
How We Learn: The Importance of Semantics to Learning in a Known World
Traver, Nicholas Kirby January 2024
Thesis advisor: Lucas Coffman / This thesis explores the importance of semantic (specific-containing) information in learning as the amount of easily recognizable information increases. The study emulates the advertising industry, which gives its findings practical relevance. Through a randomized experiment, I find significant evidence that an increased frequency of new brands harms the memory of easily identifiable brands. I also find evidence suggesting that semantically presented new brands are remembered more often than episodically (story-based) presented new brands. Additionally, I observe directional but insignificant results suggesting that the effectiveness of semantic vs. episodic information on the identification of new brands is greatest as the frequency of easily identifiable brands increases and the quantity of semantically presented brands decreases. Despite the benefit that presenting information semantically has on remembering new brands, my findings suggest that people do not retain the specifics within semantically presented impressions. / Thesis (BA) — Boston College, 2024. / Submitted to: Boston College. Morrissey School of Arts and Sciences. / Discipline: Economics. / Discipline: Departmental Honors.
|