  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Querying, Exploring and Mining the Extended Document

Sarkas, Nikolaos 31 August 2011 (has links)
The evolution of the Web into an interactive medium that encourages active user engagement has ignited a huge increase in the amount, complexity and diversity of available textual data. This evolution forces us to re-evaluate our view of documents as simple pieces of text and of document collections as immutable and isolated. Extended documents published in the context of blogs, micro-blogs, on-line social networks and customer feedback portals can be associated with a wealth of meta-data in addition to their textual component: tags, links, sentiment, entities mentioned in text, etc. Collections of user-generated documents grow, evolve, co-exist and interact: they are dynamic and integrated. These unique characteristics of modern documents and document collections present us with exciting opportunities for improving the way we interact with them. At the same time, this additional complexity, combined with the vast amounts of available textual data, presents us with formidable computational challenges. In this context, we introduce, study and extensively evaluate an array of effective and efficient solutions for querying, exploring and mining extended documents and dynamic, integrated document collections. For collections of socially annotated extended documents, we present an improved probabilistic search and ranking approach based on our growing understanding of the dynamics of the social annotation process. For extended documents, such as blog posts, associated with entities extracted from text and categorical attributes, we enable their interactive exploration through the efficient computation of strong entity associations. Associated entities are computed for all possible attribute value restrictions of the document collection. For extended documents, such as user reviews, annotated with a numerical rating, we introduce a keyword-query refinement approach. The solution enables the interactive navigation and exploration of large result sets.
We extend the skyline query to document streams, such as news articles, associated with categorical attributes and partially ordered domains. The technique incrementally maintains a small set of recent, uniquely interesting extended documents from the stream. Finally, we introduce a solution for the scalable integration of structured data sources into Web search. Queries are analysed in order to determine what structured data, if any, should be used to augment Web search results.
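As a rough illustration of the skyline idea mentioned in the abstract above, the sketch below keeps only the documents in a sliding window that no other document dominates. It is not the thesis's algorithm: it assumes numeric attributes where larger is better (the thesis works with categorical attributes over partially ordered domains), and it recomputes the skyline naively for each arrival instead of maintaining it incrementally. The names `dominates`, `skyline` and `streaming_skyline` are hypothetical.

```python
from collections import deque
from typing import Iterable, Iterator, List, Tuple

Doc = Tuple[float, ...]  # hypothetical: a document reduced to numeric scores


def dominates(a: Doc, b: Doc) -> bool:
    """True if a is at least as good as b on every attribute and better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline(docs: List[Doc]) -> List[Doc]:
    """Return the documents that no other document in the list dominates."""
    return [d for d in docs if not any(dominates(other, d) for other in docs)]


def streaming_skyline(stream: Iterable[Doc], window: int) -> Iterator[List[Doc]]:
    """Yield the skyline of the most recent `window` documents after each arrival.

    Naive recomputation, assumed here for clarity only.
    """
    recent: deque = deque(maxlen=window)
    for doc in stream:
        recent.append(doc)
        yield skyline(list(recent))


if __name__ == "__main__":
    docs = [(3, 1), (1, 3), (2, 2), (1, 1), (4, 4)]
    for current in streaming_skyline(docs, window=3):
        print(current)
```

With a window of three, the last arrival `(4, 4)` dominates everything else still in the window, so the final skyline contains that document alone.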
3

Language Identification on Short Textual Data

Cui, Yexin January 2020 (has links)
Language identification is the task of automatically detecting the language(s) in which a given text or document is written, and it is also the very first step of further natural language processing tasks. The task has been well studied for decades; however, most work has focused on long texts rather than short ones, which have proved more challenging because they carry insufficient syntactic and semantic information. In this work, we present approaches to this problem based on deep learning techniques, traditional methods and their combination. The proposed ensemble model, composed of a learning-based method and a dictionary-based method, achieves 89.6% accuracy on our newly generated gold test set, surpassing the Google Translate API by 3.7% and the industry-leading tool Langid.py by 26.1%. / Thesis / Master of Applied Science (MASc)
4

Extracting Structured Knowledge from Textual Data in Software Repositories

Hasan, Maryam 06 1900 (has links)
Software team members, as they communicate and coordinate their work with others throughout the life-cycle of their projects, generate different kinds of textual artifacts. Despite the variety of work in the area of mining software artifacts, relatively little research has focused on communication artifacts. Software communication artifacts, in addition to source code artifacts, contain useful semantic information that is not fully explored by existing approaches. This thesis presents the development of a text analysis method and tool to extract and represent useful pieces of information from a wide range of textual data sources associated with software projects. Our text analysis system integrates Natural Language Processing techniques and statistical text analysis methods with software domain knowledge. The extracted information is represented as RDF-style triples, which constitute interesting relations between developers and software products. We applied the developed system to analyze five different kinds of textual information: source code commits, bug reports, email messages, chat logs, and wiki pages. In the evaluation of our system, we found its precision to be 82%, its recall 58%, and its F-measure 68%.
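The three evaluation figures in the abstract above are consistent with one another if the F-measure is taken to be the balanced F1 score, the harmonic mean of precision and recall. The abstract does not say which variant was used, so that is an assumption here, and the function name `f_measure` is hypothetical.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (balanced F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid dividing by zero when both scores are zero
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract: 82% and 58%.
    f1 = f_measure(0.82, 0.58)
    print(f"{f1:.0%}")  # prints 68%
```

The harmonic mean sits closer to the lower of the two scores, which is why 82% precision and 58% recall give 68% and not their arithmetic mean of 70%.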
5

Extracting Structured Knowledge from Textual Data in Software Repositories

Hasan, Maryam Unknown Date
No description available.
6

Kierkegaard and the computer : some recent contributions

Hogue, Stéphane January 1990 (has links)
This document is submitted with the permission and encouragement of the department of philosophy of McGill University in lieu of a conventional thesis. Briefly, it consists of a combined account and selective historical review of some uses of the computer in philosophy, and of a partial list of my computer-related contributions to Kierkegaard scholarship. The former deals generally with the creation, interrogation and analysis of machine-readable forms of philosophical texts. The latter deals specifically with my own work of creating and analyzing Kierkegaard-related machine-readable texts.
7

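An Exploratory Study of Disaster Imagination Ability: A Comparison of University Students' Imagination with Cases from the Great Hanshin-Awaji Earthquake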

元吉, 忠寛, MOTOYOSHI, Tadahiro 20 April 2006 (has links)
This record uses content digitized by the National Institute of Informatics.
8

Kierkegaard and the computer : some recent contributions

Hogue, Stéphane January 1990 (has links)
No description available.
9

EMOTION DISCOVERY IN HINDI-ENGLISH CODE-MIXED CONVERSATIONS

Monika Vyas (18431835) 28 April 2024 (has links)
This thesis delves into emotion recognition in Hindi-English code-mixed dialogues, particularly focusing on romanized text, which is essential for understanding multilingual communication dynamics. Using a dataset from bilingual television shows, the study employs machine learning and natural language processing techniques, with models like Support Vector Machine, Logistic Regression, and XLM-Roberta tailored to handle the nuances of code-switching and transliteration in romanized Hindi-English. To combat challenges such as data imbalance, SMOTE (Synthetic Minority Over-sampling Technique) is utilized, enhancing model training and generalization. The research also explores ensemble learning with methods like VotingClassifier to improve emotional classification accuracy. Logistic regression stands out for its high accuracy and robustness, demonstrated through rigorous cross-validation. The findings underscore the potential of advanced machine learning models and advocate for further exploration of deep learning and multimodal data to enhance emotion detection in diverse linguistic settings.
10

Exploring the drivers of customers’ brand attitudes of online travel agency services: A text-mining based approach

Ray, A., Bala, P.K., Rana, Nripendra P. 14 February 2021 (has links)
Yes / This paper aims to explore the important qualitative aspects of online user-generated content that reflect customers’ brand attitudes. These qualitative aspects can help service providers understand customers’ brand attitudes by focusing on the important aspects rather than reading each entire review, which saves both time and effort. We have utilised a total of 10,000 reviews from TripAdvisor (an online travel agency provider). This study has analysed the data using a statistical technique (logistic regression), a predictive model (artificial neural networks) and a structural-modelling technique to understand the most important aspects (i.e. sentiment, emotion or parts of speech) that can help to predict customers’ brand attitudes. Results show that sentiment is the most important aspect in predicting brand attitudes. While total sentiment content and content polarity have a significant positive association with customers’ brand attitudes, negative high-arousal emotions and low-arousal emotions have a significant negative association. However, parts-of-speech aspects have no significant impact on brand attitude. The paper concludes with implications, limitations and future research directions.
