311 |
Evaluation and improvement of semantically-enhanced tagging system. Alsharif, Majdah Hussain. January 2013.
The Social Web, or ‘Web 2.0’, is focused on interaction and collaboration between web site users. It is credited with the rise of tagging systems, among other things such as blogs and wikis. Tagging systems like YouTube and Flickr offer their users simplicity and freedom in creating and sharing their own content, and folksonomy is therefore a very active research area in which many improvements have been proposed to overcome existing disadvantages such as the lack of semantic meaning, ambiguity, and inconsistency. TE is a tagging system that proposes solutions to the problems of multilingualism, lack of semantic meaning and shorthand writing (which is very common on the social web) with the aid of semantic and social resources. The current research presents an addition to the TE system in the form of an embedded stemming component that addresses the problem of differing lexical forms. Prior to this, the TE system had to be explored thoroughly and its efficiency determined in order to decide on the practicality of embedding any additional components as performance enhancements. This involved analysing the algorithm's efficiency with an analytical approach to determine its time and space complexity. TE has a time growth rate of O(N²), which is polynomial, so the algorithm is considered efficient; nonetheless, recommended modifications such as batched SQL execution could improve this. Regarding space complexity, the number of tags per photo represents the problem size, and as it grows the required memory space increases linearly. Based on the findings above, the TE system is re-implemented on Flickr instead of YouTube because of a recent YouTube restriction; this is of greater benefit for a multilingual tagging system, since the language barrier is largely irrelevant in this case. The re-implementation is achieved using ‘flickrj’ (a Java interface to the Flickr API). Next, the stemming component is added to normalise tags prior to querying the ontologies. The component is embedded using the Java implementation of the Porter2 stemmer, which supports many languages, including Italian. The impact of the stemming component on the performance of the TE system, in terms of the size of the index table and the number of retrieved results, is investigated through an experiment that showed a 48% reduction in the size of the index table. This also means that search queries have fewer system tags to compare against the search keywords, which can speed up the search. Furthermore, the experiment ran similar search trials on two versions of the TE system, one without the stemming component and one with it, and found that the latter produced more results provided the queries involved valid words and valid stems. Embedding the stemming component in the new TE system has lessened the storage overhead required for the generated system tags by reducing the size of the index table, which makes the system suited to many applications such as text classification, summarisation, email filtering, and machine translation.
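As a rough illustration of the tag-normalisation step, the sketch below collapses lexical variants of tags onto shared stems. It is a minimal Python sketch using NLTK's Snowball stemmers (whose English stemmer is Porter2) as a stand-in for the thesis's Java/flickrj implementation, and the tag list is invented.

```python
# Minimal sketch of stemming-based tag normalisation. NLTK's Snowball stemmers
# stand in for the thesis's Java Porter2 component, and the tags are invented.
from nltk.stem.snowball import SnowballStemmer

def normalise_tags(tags, language="english"):
    """Map each raw tag to its stem so lexical variants share one index entry."""
    stemmer = SnowballStemmer(language)  # "italian" is also supported
    return [stemmer.stem(tag.lower()) for tag in tags]

raw_tags = ["running", "runs", "runner", "colours", "colour", "mountains", "mountain"]
stems = normalise_tags(raw_tags)

# Fewer distinct stems than distinct raw tags means a smaller index table.
print(len(set(raw_tags)), "raw tags ->", len(set(stems)), "index entries")
print(sorted(set(stems)))
```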
|
312 |
Design of Business Process Model Repositories: Requirements, Semantic Annotation Model and Relationship Meta-model. Elias, Mturi. January 2015.
Business process management is fast becoming one of the most important approaches for designing contemporary organizations and information systems. A critical component of business process management is business process modelling. It is widely accepted that modelling business processes from scratch is a complex, time-consuming and error-prone task, yet the effort put into modelling these processes is seldom reused beyond their original purpose. Reuse of business process models has the potential to overcome the challenges of modelling business processes from scratch. Process model repositories, properly populated, are certainly a step toward supporting reuse of process models. This thesis starts with the observation that the existing process model repositories for supporting process model reuse suffer from several shortcomings that affect their usability in practice. Firstly, most of the existing repositories are proprietary, so they can only be enhanced or extended with new models by their owners. Secondly, it is difficult to locate and retrieve relevant process models from a large collection. Thirdly, process models are not goal-related, which makes it difficult to understand the business goals realized by a certain model. Finally, process model repositories lack a clear mechanism for identifying and defining the relationships between business processes, and as a result it is difficult to identify related processes. Following a design science research paradigm, this thesis proposes an open and language-independent process model repository with an efficient retrieval system to support process model reuse. The proposed repository is grounded on four original and interrelated contributions: (1) a set of requirements that a process model repository should possess to increase the probability of process model reuse; (2) a context-based process semantic annotation model for semantically annotating process models to facilitate effective retrieval of process models; (3) a business process relationship meta-model for identifying and defining the relationships between process models in the repository; and (4) an architecture of a process model repository for process model reuse. The models and architecture produced in this thesis were evaluated to test their utility, quality and efficacy. The semantic annotation model was evaluated through two empirical studies using controlled experiments. The conclusion drawn from the two studies is that the annotation model improves searching, navigation and understanding of process models. The process relationship meta-model was evaluated using an informed argument to determine the extent to which it meets the established requirements. The results of the analysis revealed that the meta-model meets the established requirements. The analysis of the architecture against the requirements likewise indicates that it meets them. / Process management, also known as case management, has become one of the most important approaches for designing today's organizations and information systems. A central component of process management is process modelling. It is widely known that modelling processes can be a complex, time-consuming and error-prone task, and the effort put into modelling processes can seldom be used beyond the processes' original purpose. Reuse of process models could overcome many of the challenges involved in modelling processes.
A repository of process models is a step toward supporting reuse of process models. This thesis starts with the observation that existing process model repositories intended to support process model reuse suffer from several shortcomings that affect their usability in practice. Firstly, most process model repositories are proprietary, and therefore only the repository owners can enhance or extend them with new models. Secondly, it is difficult to locate and retrieve relevant process models from a large repository. Thirdly, process models are not goal-related, which makes it difficult to understand the business goals realized by a particular model. Finally, process model repositories often lack a clear mechanism for identifying and defining the relationships between processes, and it is therefore difficult to identify related processes. Following a design science research paradigm, this thesis proposes an open and language-independent process model repository with an efficient retrieval system to support process model reuse. The proposed repository is based on four original and interrelated contributions: (1) a set of requirements that a process model repository needs to fulfil in order to increase the opportunities for process model reuse; (2) a context-based semantic process annotation model for semantically annotating process models to facilitate effective retrieval of process models; (3) a meta-model for process relationships for identifying and defining the relationships between process models in the repository; and (4) an architecture of a process model repository for process model reuse. The models and the architecture developed in this thesis were evaluated to test their utility, quality and efficacy. The semantic annotation model was evaluated through two empirical studies with controlled experiments. The conclusion of the two studies is that the model improves searching, navigation and understanding of process models. The meta-model for process relationships was evaluated using an informed argument to determine the extent to which it met the established requirements. The results of the analysis showed that the meta-model met these requirements. The analysis of the architecture likewise indicated that it met the established requirements.
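As a purely illustrative sketch of the kind of metadata such a repository could hold, the fragment below defines a context-based annotation record and a typed relationship link; the field names and relationship types are assumptions made for this example, not the annotation model or relationship meta-model defined in the thesis.

```python
# Illustrative sketch only: the field names and relationship types below are
# assumptions made for this example, not the thesis's actual models.
from dataclasses import dataclass, field

@dataclass
class ProcessAnnotation:
    """Context-based semantic annotation attached to a stored process model."""
    process_id: str
    goal: str                                  # business goal the process realises
    domain: str                                # e.g. "retail", "healthcare"
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)

@dataclass
class ProcessRelationship:
    """Typed link between two process models in the repository."""
    source_id: str
    target_id: str
    relation: str                              # e.g. "precedes", "specialises", "uses"

order = ProcessAnnotation("p-17", goal="Fulfil customer order", domain="retail",
                          inputs=["purchase order"], outputs=["shipped goods"])
link = ProcessRelationship("p-17", "p-04", relation="precedes")

# A repository could index annotations by goal and domain to support retrieval.
index = {(order.goal.lower(), order.domain): [order.process_id]}
print(index, link)
```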
|
313 |
Semantic Analysis in Web Usage Mining. Norguet, Jean-Pierre E. 20 March 2006.
With the emergence of the Internet and of the World Wide Web, the Web site has become a key communication channel in organizations. To satisfy the objectives of the Web site and of its target audience, adapting the Web site content to the users' expectations has become a major concern. In this context, Web usage mining, a relatively new research area, and Web analytics, the part of Web usage mining that has emerged most strongly in the corporate world, offer many Web communication analysis techniques. These techniques include prediction of the user's behaviour within the site, comparison between expected and actual Web site usage, adjustment of the Web site with respect to the users' interests, and mining and analyzing Web usage data to discover interesting metrics and usage patterns. However, Web usage mining and Web analytics suffer from significant drawbacks when it comes to supporting the decision-making process at the higher levels of the organization.
Indeed, according to organization theory, the higher levels of an organization need summarized and conceptual information to make fast, high-level, and effective decisions. For Web sites, these levels include the organization managers and the Web site chief editors. At these levels, the results produced by Web analytics tools are mostly useless, as most of them target Web designers and Web developers. Summary reports such as the number of visitors and the number of page views can be of some interest to the organization manager, but these results are poor. Finally, page-group and directory hits give the Web site chief editor conceptual results, but these are limited by several problems, such as page synonymy (several pages contain the same topic), page polysemy (a page contains several topics), page temporality, and page volatility.
For their part, Web usage mining research projects have mostly left aside Web analytics and its limitations and have focused on other research paths. Examples of these paths are usage pattern analysis, personalization, system improvement, site structure modification, marketing business intelligence, and usage characterization. A potential contribution to Web analytics can be found in research on reverse clustering analysis, a technique based on self-organizing feature maps. This technique integrates Web usage mining and Web content mining in order to rank the Web site pages according to an original popularity score. However, the algorithm is not scalable and does not address the page-polysemy, page-synonymy, page-temporality, and page-volatility problems. As a consequence, these approaches fail to deliver summarized and conceptual results.
An interesting attempt to obtain such results has been the Information Scent algorithm, which produces a list of term vectors representing the visitors' needs. These vectors provide a semantic representation of the visitors' needs and can be easily interpreted. Unfortunately, the results suffer from term polysemy and term synonymy, are visit-centric rather than site-centric, and are not scalable to produce. Finally, according to a recent survey, no Web usage mining research project has proposed a satisfying solution for providing site-wide summarized and conceptual audience metrics.
In this dissertation, we present our solution to the need for summarized and conceptual audience metrics in Web analytics. We first describe several methods for mining the Web pages output by Web servers. These methods include content journaling, script parsing, server monitoring, network monitoring, and client-side mining. These techniques can be used alone or in combination to mine the Web pages output by any Web site. Then, the occurrences of taxonomy terms in these pages can be aggregated to provide concept-based audience metrics. To evaluate the results, we implement a prototype and run a number of test cases with real Web sites.
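A minimal sketch of the aggregation step is shown below, under assumed inputs: the taxonomy and the per-page-view texts are invented, and the page-mining techniques themselves (content journaling, script parsing, and so on) are outside the example.

```python
# Minimal sketch of concept-based audience metrics: the taxonomy and the page
# texts are invented, and the page-mining step itself is not shown here.
from collections import Counter
import re

taxonomy = {
    "admissions": ["enrolment", "application", "tuition"],
    "research":   ["laboratory", "publication", "grant", "dataset"],
}

def concept_hits(page_text):
    """Count taxonomy-term occurrences in one served page, per concept."""
    words = re.findall(r"[a-z]+", page_text.lower())
    counts = Counter()
    for concept, terms in taxonomy.items():
        counts[concept] += sum(words.count(term) for term in terms)
    return counts

# Aggregate over every page output by the Web server during the period.
page_views = [
    "Tuition fees and the online application portal for enrolment ...",
    "New grant awarded to the robotics laboratory; dataset and publication ...",
]
metrics = Counter()
for text in page_views:
    metrics.update(concept_hits(text))

print(metrics.most_common())   # concept-level audience metrics
```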
According to the first experiments with our prototype and SQL Server OLAP Analysis Services, concept-based metrics prove to be highly summarized and much more intuitive than page-based metrics. As a consequence, concept-based metrics can be exploited at higher levels in the organization. For example, organization managers can redefine the organization strategy according to the visitors' interests. Concept-based metrics also give an intuitive view of the messages delivered through the Web site and allow the Web site communication to be adapted to the organization objectives. The Web site chief editor, in turn, can interpret the metrics to redefine the publishing orders and the sub-editors' writing tasks. As decisions at higher levels of the organization should be more effective, concept-based metrics should contribute significantly to Web usage mining and Web analytics.
|
314 |
Understanding semantic priming: Evidence from masked lexical decision and semantic categorization tasks. Hector, Johanna Elizabeth. January 2005.
There is now extensive behavioral and neuropsychological evidence to indicate that the semantic information of a word can be activated without conscious awareness. However, semantic activation alone may not be sufficient for observing semantic priming effects in a masked lexical decision task. In the present study, two tasks were used: lexical decision and semantic categorization. Conscious awareness of the prime was systematically manipulated by varying the duration of the prime and by varying the placement of the mask in the prime-target presentation sequence. Priming effects were observed in the semantic categorization task at a prime duration of 42 milliseconds, but no semantic priming was observed for the same prime duration in the lexical decision task. However, semantic priming effects began to emerge in lexical decision at the longer prime durations (55 and 69 ms) and under the least effective prime-mask presentation sequences. It is proposed that semantic activation alone is not sufficient for semantic priming effects in the lexical decision task but that central executive involvement is necessary, if only at the lowest level, for facilitatory effects to be observed. Furthermore, no such central executive involvement appears to be required for the semantic categorization task. The priming effects obtained in this task are interpreted in terms of a "decision priming" effect.
|
315 |
Social Tag-based Community Recommendation Using Latent Semantic Analysis. Akther, Aysha. 07 September 2012.
Collaboration and sharing of information are the basis of modern social Web systems. Users of social Web systems establish and join online communities in order to collectively share their content with a group of people who have a common topic of interest. Group and community activities have increased exponentially in modern social Web systems. With the explosive growth of social communities, users of social Web systems have experienced considerable difficulty in discovering communities relevant to their interests. In this study, we address the problem of recommending communities to individual users. Recommender techniques that are based solely on community affiliation may fail to find a wide range of suitable communities for users when the available data about them are insufficient. We regard this problem as a tag-based personalized search. Based on the social tags used by members of communities, we first represent communities in a low-dimensional space, the so-called latent semantic space, by using Latent Semantic Analysis. Then, for recommending communities to a given user, we capture how relevant each community is both to the user's personal tag usage and to other community members' tagging patterns in the latent space. We focus in particular on the challenging problem of recommending communities to users who have joined very few communities or who have no prior community membership. Our evaluation on two heterogeneous datasets shows that our approach can significantly improve recommendation quality.
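The sketch below illustrates the general approach rather than the exact model evaluated in the study: the tag data are invented, community profiles are naively built by concatenating members' tags, and scikit-learn's TruncatedSVD provides the latent semantic projection.

```python
# Sketch of the general approach only: the tag data are invented and community
# profiles are naively built by concatenating members' tags.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

# One "document" of social tags per community, plus one for the target user.
community_tags = [
    "photography camera lens portrait lighting",
    "hiking mountains trail camping backpacking",
    "python programming code opensource linux",
]
user_tags = "camera portrait lighting studio"

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(community_tags + [user_tags])

# Project the tag vectors into a low-dimensional latent semantic space.
lsa = TruncatedSVD(n_components=2, random_state=0)
Z = lsa.fit_transform(X)

# Rank communities by similarity to the user's tag profile in the latent space.
scores = cosine_similarity(Z[-1:], Z[:-1])[0]
ranking = sorted(enumerate(scores), key=lambda pair: -pair[1])
print(ranking)   # highest-scoring community indices first
```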
|
316 |
Ontology-based information standards development. Heravi, Bahareh Rahmanzadeh. January 2012.
Standards may be argued to be important enablers for achieving interoperability, as they aim to provide unambiguous specifications for the error-free exchange of documents and information. By implication, therefore, it is important to model and represent the concept of a standard in a clear, precise and unambiguous way. Although standards development organisations usually provide guidelines for the process of developing and approving standards, these guidelines are usually more concerned with the administrative aspects of the process. As a consequence, the state of the art lacks practical support for developing the structure and content of a standard specification. In short, there is currently no systematic development method available (a) for developing the conceptual model underpinning a standard, and/or (b) to guide a group of stakeholders in developing a standard specification. Semantic interoperability is considered to be an essential factor for effective interoperation; indeed, some strongly equate the ability to achieve semantic interoperability effectively and efficiently with quality. Semantics require that the meaning of terms, their relationships, and the restrictions and rules in a standard be clearly defined in the early stages of standard development and act as a basis for the later stages. This research proposes that ontology can help standards developers and stakeholders address the issues of improving conceptual models and providing a robust and shared understanding of the domain. This thesis presents OntoStanD, a comprehensive ontology-based standards development methodology that utilises the best practices of existing ontology creation methods. The potential value of OntoStanD lies in providing a comprehensive, clear and unambiguous method for developing robust information standards, which are more test-friendly and of higher quality. OntoStanD also facilitates standards conformance testing and change management, impacts interoperability, and assists in improving communication within the standards development team. Last, OntoStanD provides an approach that is repeatable, teachable and potentially general enough for creating any kind of information standard.
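As a hedged illustration of how a standard's terms, relationships and restrictions might be captured as an ontology, the fragment below models a hypothetical invoice standard with rdflib; the classes and properties are invented and are not taken from OntoStanD.

```python
# Hypothetical fragment: the classes, properties and restriction comment below
# are invented to illustrate an ontology-based conceptual model for a standard;
# they are not taken from OntoStanD.
from rdflib import Graph, Namespace, Literal, RDF, RDFS, OWL

STD = Namespace("http://example.org/invoice-standard#")
g = Graph()
g.bind("std", STD)

# Terms of the standard become classes ...
for cls in ("Invoice", "LineItem", "Party"):
    g.add((STD[cls], RDF.type, OWL.Class))

# ... and their relationships become properties with explicit domain and range.
g.add((STD.hasLineItem, RDF.type, OWL.ObjectProperty))
g.add((STD.hasLineItem, RDFS.domain, STD.Invoice))
g.add((STD.hasLineItem, RDFS.range, STD.LineItem))
g.add((STD.issuedBy, RDF.type, OWL.ObjectProperty))
g.add((STD.issuedBy, RDFS.domain, STD.Invoice))
g.add((STD.issuedBy, RDFS.range, STD.Party))

# Rules and restrictions can be recorded alongside the terms they constrain.
g.add((STD.Invoice, RDFS.comment,
       Literal("Every invoice must reference at least one line item.")))

print(g.serialize(format="turtle"))
```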
|
317 |
A Semantic Web based search engine with X3D visualisation of queries and results. Gkoutzis, Konstantinos. January 2013.
The Semantic Web project has introduced new techniques for managing information. Data can now be organised more efficiently, and in such a way that computers can take advantage of the relationships that characterise the given input to present more relevant output. Semantic Web based search engines can quickly educe exactly what needs to be found and retrieve it while avoiding information overload. Up until now, search engines have interacted with their users by asking them to look for words and phrases. We propose the creation of a new-generation Semantic Web search engine that offers a visual interface for queries and results. To create such an engine, information input must be viewed not merely as keywords, but as specific concepts and objects which are all part of the same universal system. To make the manipulation of the interconnected visual objects simpler and more natural, 3D graphics based on the X3D Web standard are utilised, allowing users to semantically synthesise their queries faster and in a way that is more logical both for them and for the computer.
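The sketch below conveys the underlying idea of querying concepts and relationships instead of keywords, using an invented RDF dataset and a SPARQL query; the X3D visual layer described above is not part of the sketch.

```python
# Invented RDF dataset and query, standing in for the idea of searching by
# concepts and relationships rather than keywords; the X3D interface is not
# shown here.
from rdflib import Graph, Namespace, RDF

EX = Namespace("http://example.org/things#")
g = Graph()
g.add((EX.eiffel_tower, RDF.type, EX.Landmark))
g.add((EX.eiffel_tower, EX.locatedIn, EX.paris))
g.add((EX.louvre, RDF.type, EX.Museum))
g.add((EX.louvre, EX.locatedIn, EX.paris))

# "Things located in Paris" expressed as a structural query over concepts,
# not as a bag of keywords.
results = g.query("""
    PREFIX ex: <http://example.org/things#>
    SELECT ?thing ?type WHERE {
        ?thing ex:locatedIn ex:paris ;
               a ?type .
    }
""")
for thing, rdf_type in results:
    print(thing, rdf_type)
```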
|
318 |
Statistical semantic processing using Markov logic. Meza-Ruiz, Ivan Vladimir. January 2009.
Markov Logic (ML) is a novel approach to Natural Language Processing tasks [Richardson and Domingos, 2006; Riedel, 2008]. It is a Statistical Relational Learning language based on First Order Logic (FOL) and Markov Networks (MN), and it allows one to treat a task as structured classification. In this work, we investigate ML for the semantic processing tasks of Spoken Language Understanding (SLU) and Semantic Role Labelling (SRL). Both tasks consist of identifying a semantic representation for the meaning of a given utterance or sentence. However, they differ in nature: SLU belongs to the field of dialogue systems, where the domain is closed and the language is spoken [He and Young, 2005], while SRL targets open domains and, traditionally, written text [Márquez et al., 2008]. Robust SLU is a key component of spoken dialogue systems. This component consists of identifying the meaning of the user utterances addressed to the system. Recent statistical approaches to SLU depend on additional resources (e.g., gazetteers, grammars, syntactic treebanks) which are expensive and time-consuming to produce and maintain. On the other hand, simple datasets annotated only with slot-values are commonly used in dialogue system development and are easy to collect, automatically annotate, and update. However, slot-values leave out some of the fine-grained long-distance dependencies present in other semantic representations. In this work we investigate the development of SLU modules with minimum resources, using slot-values as their semantic representation. We propose to use ML to capture long-distance dependencies which are not explicitly available in the slot-value semantic representation. We test the adequacy of the ML framework by comparing it against a set of baselines using state-of-the-art approaches to semantic processing. The results of this research have been published in Meza-Ruiz et al. [2008a,b]. Furthermore, we address the question of the scalability of the ML approach to other NLP tasks involving the identification of semantic representations. In particular, we focus on SRL: the task of identifying predicates and arguments within sentences, together with their semantic roles. The semantic representation built during SRL is more complex than the slot-values used in dialogue systems, in the sense that it includes the notion of predicate/argument scope. SRL is defined in the context of open domains under the premise that several levels of extra resources (lemmas, POS tags, constituent or dependency parses) are available. In this work, we propose an ML model of SRL and experiment with the different architectures we can describe for the model, which gives us an insight into the types of correlations that the ML model can express [Riedel and Meza-Ruiz, 2008; Meza-Ruiz and Riedel, 2009]. Additionally, we tested our minimal-resources setup in a state-of-the-art dialogue system: the TownInfo system. In this case, we were given a small dataset of gold-standard, system-dependent semantic representations, and we rapidly developed an SLU module used in the functioning dialogue system. No extra resources were necessary in order to reach state-of-the-art results.
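To give a flavour of the Markov Logic idea in the SLU setting, the toy sketch below scores candidate slot-value labellings of an utterance as a weighted sum of first-order-style formulas; the formulas, weights and labels are invented, and a real MLN engine would learn the weights and perform inference far more efficiently than this brute-force enumeration.

```python
# Toy illustration of the Markov Logic idea: weighted first-order-style
# formulas jointly score a slot-value labelling of an utterance. The formulas,
# weights and labels are invented, and the brute-force search below is only
# feasible because the example is tiny.
from itertools import product

tokens = ["flight", "to", "london", "on", "monday"]
labels = ["O", "toloc.city", "depart.day"]

def f_city_after_to(y):    # the word after "to" tends to be a destination city
    return sum(1 for i, w in enumerate(tokens)
               if w == "to" and i + 1 < len(tokens) and y[i + 1] == "toloc.city")

def f_day_after_on(y):     # the word after "on" tends to be a departure day
    return sum(1 for i, w in enumerate(tokens)
               if w == "on" and i + 1 < len(tokens) and y[i + 1] == "depart.day")

def f_function_words_O(y): # function words tend to carry no slot value
    return sum(1 for w, l in zip(tokens, y) if w in {"to", "on"} and l == "O")

weighted_formulas = [(f_city_after_to, 2.0), (f_day_after_on, 2.0),
                     (f_function_words_O, 1.0)]

def score(y):
    """Unnormalised log-score: weighted count of satisfied formula groundings."""
    return sum(weight * formula(y) for formula, weight in weighted_formulas)

best = max(product(labels, repeat=len(tokens)), key=score)
print(list(zip(tokens, best)), "score:", score(best))
```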
|
319 |
Patterns of Change in Semantic Clustering in Schizophrenia Spectrum Disorders: What Can it Tell Us about the Nature of Clustering Deficits. Edwards, Kimberly. 08 1900.
Semantic clustering has been used as a measure of learning strategies in a number of clinical populations and has been found to be deficient in individuals with schizophrenia, but less attention has been paid to the dynamic use of this strategy over the course of fixed-order learning trials. In the current study, we examined this pattern of clustering use over trials in a sample of individuals with schizophrenia and explored whether the addition of this dynamic information would help us to better predict specific executive deficits. Results suggested that a decrease in semantic clustering across trials was associated with some executive deficits in the predicted manner. Nonetheless, the overall semantic clustering index generally proved more effective for this purpose, suggesting that, in this population, adding dynamic information about strategy use is not likely to contribute considerably to clinical prediction and understanding.
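As a simplified stand-in for how a per-trial clustering measure might behave, the sketch below counts adjacent same-category recalls on each trial and tracks the change across trials; both the scoring rule and the recall sequences are invented and do not reproduce the index or data used in the study.

```python
# Simplified stand-in for a per-trial semantic clustering measure: count the
# adjacent same-category recalls on each trial. The scoring rule and recall
# sequences are invented and do not reproduce the study's index or data.
CATEGORY = {"apple": "fruit", "pear": "fruit", "grape": "fruit",
            "desk": "furniture", "chair": "furniture", "lamp": "furniture"}

def clustering_score(recall_sequence):
    """Number of consecutive recalls drawn from the same semantic category."""
    pairs = zip(recall_sequence, recall_sequence[1:])
    return sum(1 for a, b in pairs if CATEGORY[a] == CATEGORY[b])

# One recall sequence per fixed-order learning trial.
trials = [
    ["apple", "desk", "pear", "lamp"],             # little clustering early on
    ["apple", "pear", "desk", "chair"],
    ["apple", "pear", "grape", "desk", "chair"],   # more clustering later
]
per_trial = [clustering_score(trial) for trial in trials]
print(per_trial)   # e.g. [0, 2, 3]; the trend across trials is the kind of
                   # dynamic information examined in the study
```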
|
320 |
Semantic pluralism. Viebahn, Emanuel. January 2014.
This thesis defends Semantic Pluralism, the view that sentences express sets of propositions in context. It puts forward two arguments against Contextualism, the main opposing view, on which each sentence expresses exactly one proposition in context. It spells out two versions of Pluralism: Flexible Pluralism, which takes most sentences to be context-sensitive, and Strong Pluralism, which denies that context-sensitivity is widespread. And it defends Flexible Pluralism and Strong Pluralism from a number of objections.
|