131 |
Advanced Intranet Search Engine
Narayan, Nitesh January 2009
Information retrieval has been a pervasive part of human society since its beginnings. With the advent of the Internet and the World Wide Web it became an extensive area of research and a major focus, which led to the development of various search engines for locating desired information, mostly on globally connected computer networks, viz. the Internet. But there is another major kind of computer network, viz. the intranet, which has not seen much advancement in information retrieval approaches, in spite of being a major source of information within a large number of organizations.
The most common technique for intranet-based search engines is still merely database-centric. In practice, intranets are therefore unable to benefit from the sophisticated techniques developed for Internet-based search engines without exposing their data to commercial search engines.
In this Master's-level thesis we propose a "state of the art architecture" for an advanced intranet search engine that is capable of dealing with the continuously growing size of an intranet's knowledge base. This search engine employs lexical processing of documents, where documents are indexed and searched based on standalone terms or keywords, along with semantic processing of the documents, where the context of the words and the relationships among them are given more importance.
Combining lexical and semantic processing of documents gives an effective approach for handling navigational queries along with research queries, in contrast to modern search engines, which use either lexical processing or semantic processing (or one of them as the major approach). We give equal importance to both approaches in our design, taking the best of both worlds.
This work also takes into account various widely acclaimed concepts such as inference rules, ontologies and active feedback from the user community to continuously enhance and improve the quality of search results, along with the possibility of inferring and deducing new knowledge from existing knowledge, while preparing for the advent of the Semantic Web.
|
132 |
Knowledge Integration to Overcome Ontological Heterogeneity: Challenges from Financial Information Systems
Firat, Aykut, Madnick, Stuart E., Grosof, Benjamin 01 1900
The shift towards global networking brings with it many opportunities and challenges. In this paper, we discuss key technologies in achieving global semantic interoperability among heterogeneous information systems, including both traditional and web data sources. In particular, we focus on the importance of this capability and technologies we have designed to overcome ontological heterogeneity, a common type of disparity in financial information systems. Our approach to representing and reasoning with ontological heterogeneities in data sources is an extension of the Context Interchange (COIN) framework, a mediator-based approach for achieving semantic interoperability among heterogeneous sources and receivers. We also analyze the issue of ontological heterogeneity in the context of source-selection, and offer a declarative solution that combines symbolic solvers and mixed integer programming techniques in a constraint logic-programming framework. Finally, we discuss how these techniques can be coupled with emerging Semantic Web related technologies and standards such as Web-Services, DAML+OIL, and RuleML, to offer scalable solutions for global semantic interoperability. We believe that the synergy of database integration and Semantic Web research can make significant contributions to the financial knowledge integration problem, which has implications in financial services, and many other e-business tasks. / Singapore-MIT Alliance (SMA)
|
133 |
Exploration des mécanismes non conscients de la perception de la parole: approches comportementales et électroencéphalographiques / Exploration of non-conscious mechanisms involved in speech perception: Evidence from behavioral and electroencephalographic studies
Signoret, Carine January 2010
Although a great deal of information is available from our environment at every moment, only a small part gives rise to a conscious percept. It is therefore legitimate to ask which mechanisms are involved in perception. On the basis of which processes is a sensory stimulation perceived consciously? What happens to stimulations that are not consciously perceived? The work presented in this thesis aims to offer partial answers to these two questions in the auditory modality. Through several behavioral and electroencephalographic studies, we suggest that knowledge could exert a top-down facilitatory influence on high-level as well as low-level processing (such as detection) of complex auditory stimulations. Stimulations about which we have some knowledge (phonological or semantic) are more easily detected than stimulations that contain neither phonological nor semantic information. We also show that the activation of knowledge influences the perception of subsequent stimulations, even when the context is not perceived consciously. This is evidenced by a subliminal semantic priming effect and by modifications of neural oscillations in the beta frequency band associated with lexical processing of stimulations that were not consciously categorized. Hence, auditory perception can be considered the product of a continuous interaction between the context set by the environment and the knowledge one has about specific stimuli. Such an interaction would lead listeners to preferentially perceive what they already know.
|
134 |
A longitudinal study of semantic memory impairment in patients with Alzheimer’s disease
Mårdh, Selina, Nägga, Katarina, Samuelsson, Stefan January 2013
Introduction: The present study explored the nature of the semantic deterioration normally displayed in the course of Alzheimer’s disease (AD). The aim was to disentangle the extent to which semantic memory problems in patients with AD are best characterized as loss of semantic knowledge rather than difficulties in accessing semantic knowledge.
Method: A longitudinal approach was applied. The same semantic tests and the same items were used across three test occasions a year apart. Twelve Alzheimer patients and 20 matched control subjects, out of a total of 25 cases in each group, remained at the final test occasion.
Results and Conclusions: Alzheimer patients were impaired in all the semantic tasks as compared to the matched comparison group. A progressive deterioration was evident during the study period. Our findings suggest that semantic impairment is mainly due to loss of information rather than problems in accessing semantic information.
|
135 |
Novelty Detection by Latent Semantic Indexing
Zhang, Xueshan January 2013
As a new topic in text mining, novelty detection is a natural extension of information retrieval systems, or search engines. By refining raw search results, filtering out old news and keeping only the novel messages, it relieves readers of information overload. One of the difficulties in novelty detection is the inherent ambiguity of language, which is the carrier of information. Among the sources of ambiguity, synonymy proves to be a notable factor. To address this issue, previous studies mainly employed WordNet, a lexical database that can be regarded as a thesaurus. Rather than borrowing a dictionary, we proposed a statistical approach employing Latent Semantic Indexing (LSI) to learn semantic relationships automatically with the help of language resources.
An immediate problem in applying LSI, which involves matrix factorization, is that the dataset in novelty detection is dynamic and constantly changing. To imitate a real-world scenario, texts are ranked in chronological order and examined one by one. Each text is compared only with those that appeared earlier, while later ones remain unknown. As a result, the data matrix starts as a one-row vector representing the first report, and a new row is added at the bottom every time a new document is read. Such a changing dataset makes it hard to employ matrix methods directly. Although LSI has long been acknowledged as an effective text mining method for capturing semantic structure, it has never been used in novelty detection, nor have other statistical treatments. We tried to change this situation by introducing an external text source to build the latent semantic space, onto which the incoming news vectors were projected.
We used the Reuters-21578 dataset and the TREC data as sources of latent semantic information. Topics were divided into years and types in order to take the differences between them into account. Results showed that LSI, though very effective in traditional information retrieval tasks, yielded only a slight improvement in performance for some data types. The extent of improvement depended on the similarity between the news data and the external information. An examination of the co-occurrence matrix attributed this limited performance to the unique features of microblogs. Their short sentences and restricted vocabulary made it very hard to recover and exploit latent semantic information with traditional data structures.
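The projection step described in this abstract can be illustrated with a minimal sketch. It assumes a standard LSI fold-in (a truncated SVD of an external term-by-document matrix, then projecting each incoming term vector into the resulting space) and scores novelty as one minus the highest cosine similarity to any earlier document. The function names, the novelty score, and the NumPy-based setup are illustrative assumptions, not the thesis's actual implementation.

```python
import numpy as np

def build_latent_space(term_doc, k):
    """Truncated SVD of an external term-by-document matrix (terms x docs).

    Assumes the top k singular values are non-zero.
    """
    U, s, _ = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k]

def project(doc_vec, U_k, s_k):
    """Fold a term-count vector into the k-dimensional latent space."""
    return (doc_vec @ U_k) / s_k

def novelty_scores(stream, U_k, s_k):
    """Score each document as 1 - max cosine similarity to earlier ones.

    Documents are processed in order, so each one is compared only with
    those that appeared before it (hypothetical scoring rule).
    """
    seen, scores = [], []
    for doc_vec in stream:
        z = project(np.asarray(doc_vec, dtype=float), U_k, s_k)
        norm = np.linalg.norm(z)
        z = z / norm if norm > 0 else z
        if seen:
            scores.append(1.0 - max(float(z @ p) for p in seen))
        else:
            scores.append(1.0)  # the first document has nothing to repeat
        seen.append(z)
    return scores
```

Because the latent space is built once from the external corpus, the growing stream never has to be refactorized; two incoming documents that use different but co-occurring terms land close together after projection even though their raw term vectors share no words.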
|
136 |
Semantic social routing in Gnutella
Upadrashta, Yamini 18 February 2005
The objective of this project is to improve the performance of the Gnutella peer-to-peer protocol (version 0.4) by introducing a semantic-social routing model and several categories of interest. The Gnutella protocol requires peers to broadcast messages to their neighbours when they search for files. This message passing generates a lot of traffic in the network, which degrades the quality of service. We propose using social networks to optimize the speed of search and to improve the quality of service in a Gnutella-based peer-to-peer environment. Each peer creates and updates a friends list for each category of interest from its past experience. Once peers have generated their friends lists, they use these lists to route queries semantically in the network. Search messages in a given category are mainly sent to friends who have been useful in the past in finding files in the same category. This helps to reduce search time and to decrease network traffic by minimizing the number of messages circulating in the system as compared to standard Gnutella. By simulating a peer-to-peer environment with the JADE multi-agent system platform, this project demonstrates that by learning other peers' interests and by building and exploiting their social networks (friends lists) to route queries semantically, peers can get more relevant resources faster and with less traffic generated, i.e. that the performance of the Gnutella system can be improved.
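The friends-list routing described above can be sketched as follows. This is a minimal illustration, not the thesis's JADE implementation: the `Peer` class, the hit counter used to rank friends, the `max_friends` cap, and the fallback to a plain broadcast when no friends are known for a category are all assumptions made for the example.

```python
from collections import defaultdict

class Peer:
    """Hypothetical peer that keeps one friends list per category of interest."""

    def __init__(self, name, max_friends=3):
        self.name = name
        self.max_friends = max_friends
        self.neighbours = []
        # category -> {peer name -> number of useful answers received}
        self.hits = defaultdict(lambda: defaultdict(int))

    def record_hit(self, category, peer_name):
        """Remember that peer_name supplied a useful file in this category."""
        self.hits[category][peer_name] += 1

    def friends(self, category):
        """Most useful peers for a category, best first (ties broken by name)."""
        ranked = sorted(self.hits[category].items(),
                        key=lambda kv: (-kv[1], kv[0]))
        return [name for name, _ in ranked[: self.max_friends]]

    def route(self, category):
        """Targets for a query: friends if any are known, else all neighbours."""
        return self.friends(category) or list(self.neighbours)
```

A peer with no history in a category behaves like standard Gnutella and broadcasts; once it has recorded useful answers, queries in that category go to a short list of friends instead, which is where the traffic reduction would come from.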
|
137 |
The Role of Function, Homogeneity and Syntax in Creative Performance on the Uses of Objects Task
Forster, Evelyn 24 February 2009
The Uses of Objects Task is a widely used assessment of creative performance, but it relies on subjective scoring methods for evaluation. A new version of the task was devised using Latent Semantic Analysis (LSA), a computational tool used to measure semantic distance. 135 participants provided as many creative uses as they could for each of 20 separate objects. Responses were analyzed for strategy use, category switching, variety, and originality, as well as for a subjective measure of creativity given by independent raters. The LSA originality measure was more reliable than the subjective measure, and values averaged over participants correlated with both subjective evaluations and self-assessment of creativity. The score appeared to successfully isolate the creativity of the people themselves, rather than the potential creativity afforded by a given object.
|
139 |
A Semi-Supervised Approach to the Construction of Semantic Lexicons
Ahmadi, Mohamad Hasan 14 March 2012
A growing number of applications require dictionaries of words belonging to semantic classes present in specialized domains. Manually constructed knowledge bases often do not provide sufficient coverage of specialized vocabulary and require substantial effort to build and keep up-to-date. In this thesis, we propose a semi-supervised approach to the construction of domain-specific semantic lexicons based on the distributional similarity hypothesis. Our method starts with a small set of seed words representing the target class and an unannotated text corpus. It locates instances of seed words in the text and generates lexical patterns from their contexts; these patterns in turn extract more words/phrases that belong to the semantic category in an iterative manner. This bootstrapping process can be continued until the output lexicon reaches the desired size.
We explore techniques such as learning lexicons for multiple semantic classes at the same time and using feedback from competing lexicons to increase learning precision. Evaluated on the extraction of dish names and subjective adjectives from a corpus of restaurant reviews, our approach demonstrates great flexibility in learning various word classes, as well as performance improvements over state-of-the-art bootstrapping and distributional similarity techniques for the extraction of semantically similar words. Its shallow lexical patterns also prove superior to syntactic patterns in capturing the semantic class of words.
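The bootstrapping loop described in this abstract (seed words, then context patterns, then new words, repeated) can be sketched in a few lines. The sketch below is a simplified illustration under stated assumptions: patterns are just the pair of words immediately to the left and right of a lexicon word, candidates are single words ranked by raw frequency, and the function name and parameters are hypothetical. It does not model the competing-lexicon feedback or the phrase extraction mentioned in the thesis.

```python
import re
from collections import Counter

def bootstrap_lexicon(sentences, seeds, rounds=2, top_patterns=2, top_words=2):
    """Grow a lexicon from seed words using (left word, right word) patterns."""
    lexicon = set(seeds)
    tokenized = [re.findall(r"\w+", s.lower()) for s in sentences]
    for _ in range(rounds):
        # 1. Collect the contexts in which known lexicon words occur.
        patterns = Counter()
        for toks in tokenized:
            for i in range(1, len(toks) - 1):
                if toks[i] in lexicon:
                    patterns[(toks[i - 1], toks[i + 1])] += 1
        best = {p for p, _ in patterns.most_common(top_patterns)}
        # 2. Apply the most frequent patterns to find new candidate words.
        candidates = Counter()
        for toks in tokenized:
            for i in range(1, len(toks) - 1):
                if (toks[i - 1], toks[i + 1]) in best and toks[i] not in lexicon:
                    candidates[toks[i]] += 1
        new_words = [w for w, _ in candidates.most_common(top_words)]
        if not new_words:
            break  # nothing left to learn from the current patterns
        lexicon.update(new_words)
    return lexicon
```

With patterns this shallow, a generic context such as "the ... was" would also pull in off-class words, which is the kind of drift that feedback from competing lexicons is meant to limit.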
|