411

Improving the secondary utilization of clinical data by incorporating context

D'Avolio, Leonard W., Rees, Galya, Boyadzhyan, Lousine January 2006 (has links)
This is a submission to the "Interrogating the social realities of information and communications systems pre-conference workshop, ASIST AM 2006." There is great potential in the utilization of existing clinical data to assist in decision support, epidemiology, and information retrieval. As we transition from evaluating systems' abilities to accurately capture the information in the record, to the clinical application of results, we must incorporate the contextual influences that affect such efforts. A methodology is proposed to assist researchers in identifying strengths and weaknesses of clinical data for application to secondary purposes. The results of its application to three ongoing clinical research projects are discussed.
412

Identifying Latent Attributes from Video Scenes Using Knowledge Acquired From Large Collections of Text Documents

Tran, Anh Xuan January 2014 (has links)
Peter Drucker, a well-known influential writer and philosopher in the field of management theory and practice, once claimed that “the most important thing in communication is hearing what isn't said.” It is not difficult to see that a similar concept also holds in the context of video scene understanding. In almost every non-trivial video scene, most important elements, such as the motives and intentions of the actors, can never be seen or directly observed, yet the identification of these latent attributes is crucial to our full understanding of the scene. That is to say, latent attributes matter. In this work, we explore the task of identifying latent attributes in video scenes, focusing on the mental states of participant actors. We propose a novel approach to the problem based on the use of large text collections as background knowledge and minimal information about the videos, such as activity and actor types, as query context. We formalize the task and a measure of merit that accounts for the semantic relatedness of mental state terms, as well as their distribution weights. We develop and test several largely unsupervised information extraction models that identify the mental state labels of human participants in video scenes given some contextual information about the scenes. We show that these models produce complementary information and their combination significantly outperforms the individual models, and improves performance over several baseline methods on two different datasets. We present an extensive analysis of our models and close with a discussion of our findings, along with a roadmap for future research.
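The scoring idea this abstract describes — weighting candidate mental-state labels by both their corpus distribution and their semantic relatedness to the scene context — can be sketched roughly as follows. This is a hedged illustration, not the thesis's model: the relatedness measure (WordNet path similarity via NLTK), the candidate labels, the corpus counts, and the context terms are all invented stand-ins.

```python
# Rough sketch (not the thesis's actual model): score candidate mental-state
# labels by combining a corpus-derived distribution weight with WordNet-based
# semantic relatedness to terms describing the scene context.
from nltk.corpus import wordnet as wn  # requires the WordNet data to be downloaded

def relatedness(term_a, term_b):
    """Maximum WordNet path similarity over all synset pairs; 0.0 if none found."""
    scores = [s1.path_similarity(s2) or 0.0
              for s1 in wn.synsets(term_a)
              for s2 in wn.synsets(term_b)]
    return max(scores, default=0.0)

def rank_mental_states(candidates, corpus_counts, context_terms):
    total = sum(corpus_counts.values()) or 1
    ranked = []
    for label in candidates:
        weight = corpus_counts.get(label, 0) / total              # distribution weight
        rel = max(relatedness(label, c) for c in context_terms)   # relatedness to context
        ranked.append((weight * rel, label))
    return sorted(ranked, reverse=True)

# Illustrative call with invented counts and context terms:
# rank_mental_states(["anger", "joy", "fear"],
#                    {"anger": 40, "joy": 25, "fear": 10},
#                    ["argument", "shouting"])
```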
413

Outomatiese Afrikaanse woordsoortetikettering / deur Suléne Pilon

Pilon, Suléne January 2005 (has links)
Any community that wants to be part of technological progress has to ensure that the language(s) of that community has/have the necessary human language technology resources. Part of these resources are so-called "core technologies", including part-of-speech taggers. The first part-of-speech tagger for Afrikaans is developed in this research project. It is indicated that three resources (a tag set, a tagging algorithm and annotated training data) are necessary for the development of such a part-of-speech tagger. Since none of these resources exist for Afrikaans, three objectives are formulated for this project, i.e. (a) to develop a linguistically accurate tag set for Afrikaans; (b) to determine which algorithm is the most effective one to use; and (c) to find an effective method for generating annotated Afrikaans training data. To reach the first objective, a unique and language-specific tag set was developed for Afrikaans. The resulting tag set is relatively big and consists of 139 tags. The level of specificity of the tag set can easily be adjusted to make the tag set smaller and less specific. After the development of the tag set, research is done on different approaches to, and techniques that can be used in, the development of a part-of-speech tagger. The available algorithms are evaluated by means of prerequisites that were set, and in doing so the most effective algorithm for the purposes of this project, TnT, is identified. Bootstrapping is then used to generate training data with the help of the TnT algorithm. This process results in 20,000 correctly annotated words, and thus the annotated training data necessary for the development of a part-of-speech tagger is obtained. The tagger that is trained on 20,000 words reaches an accuracy of 85.87% when evaluated. The tag set is then simplified to thirteen tags in order to determine the effect that the size of the tag set has on the accuracy of the tagger. The tagger is 93.69% accurate when using the diminished tag set. The main conclusion of this study is that training data of 20,000 words is not enough for the Afrikaans TnT tagger to compete with other state-of-the-art taggers. The tagger and the data that were developed in this project can be used to generate even more training data in order to develop an optimally accurate Afrikaans TnT tagger. Different techniques might also lead to better results; therefore other algorithms should be tested. / Thesis (M.A.)--North-West University, Potchefstroom Campus, 2005.
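The train-then-bootstrap loop described above can be illustrated with NLTK's implementation of the TnT algorithm. The two toy sentences and their tag labels below are hypothetical stand-ins for the 20,000-word annotated Afrikaans training set, which is not reproduced here.

```python
# Toy illustration of the bootstrapping loop, using NLTK's TnT implementation.
from nltk.tag import tnt

# Hypothetical mini-corpus standing in for the annotated Afrikaans data;
# the tag labels are invented for this example.
gold = [
    [("die", "DET"), ("hond", "N"), ("blaf", "V")],
    [("die", "DET"), ("kat", "N"), ("slaap", "V")],
]

tagger = tnt.TnT()
tagger.train(gold)

# Bootstrapping step: tag unannotated text, hand-correct the output, and
# fold the corrected sentences back into `gold` before retraining.
print(tagger.tag(["die", "kat", "blaf"]))
```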
414

Natūralios kalbos apdorojimo terminų ontologija: kūrimo problemos ir jų sprendimo būdai / Ontology of natural language processing terms: development issues and their solutions

Ramonas, Vilmantas 17 June 2010 (has links)
Šiame darbe aptariamas natūralios kalbos apdorojimo terminų ontologijos kūrimas, kūrimo problemos ir jų sprendimo būdai. Tam, iš skirtingų šaltinių surinkta 217 NLP terminų. Terminai išversti į lietuvių kalbą. Trumpai aptartos problemos verčiant. Aprašytos tiek kompiuterinės, tiek filosofinės ontologijos, paminėti jų panašumai ir skirtumai. Išsamiau aptartas filosofinis požiūris į sąvokų ir daiktų panašumą, ką reikia žinoti, siekiant kiek galima geriau suprasti kompiuterinių ontologijų sudarymo principus. Išnagrinėtas pats NLP terminas, kas sudaro NLP, kokios natūralios kalbos apdorojimo technologijos jau sukurtos, kokios dar kuriamos. NLP terminų ontologijos sudarymui pasirinkus Teminių žemėlapių ontologijos struktūrą ir principus, plačiai aprašyti Teminių žemėlapių (TM) sudarymo principai, pagrindinės TM sudedamosios dalys: temos, temų vardai, asociacijos, vaidmenys asociacijose ir kiti. Vėliau, iš turimų terminų, paliekant tokią struktūrą, kokia rasta šaltinyje, nubraižytas medis. Prieita išvados, jog terminų skaičių reikia mažinti ir atsisakyti pirminės iš šaltinių atsineštos struktūros. Tad palikti tik 69 terminai, darant prielaidą, jog šie svarbiausi. Šiems terminams priskirta keliolika tipų, taip juos suskirstant į grupes. Ieškant dar geresnio skirstymo būdo, kiekvienam iš terminų priskirtas vienas ar keli jį geriausiai nusakantys meta aprašymai, pvz.: mašininis vertimas – vertimas, aukštas automatizavimo lygis. Visi meta aprašymai suskirstyti į 7 stambiausias grupes... [toliau žr. visą tekstą] / This work discusses the development of an ontology of natural language processing terms, the problems encountered, and their solutions. A collection of 217 NLP terms was gathered from different sources and translated into Lithuanian; the translation problems are briefly discussed. Both computational and philosophical ontologies are described, together with their similarities and differences. The philosophical view of the similarity of concepts and objects is discussed in detail, since it is needed to understand the principles by which computational ontologies are built. The term NLP itself is examined: what NLP comprises, which natural language processing technologies have already been developed, and which are still in development. The structure and principles of Topic Maps were chosen for composing the ontology of NLP terms, so the principles of Topic Map (TM) composition and the main TM constructs are described at length: topics, topic names, associations, roles in associations, and others. A tree was then drawn from the collected terms, preserving the structure found in the sources. It was concluded that the number of terms should be reduced and the original structure taken from the sources abandoned, so only 69 terms, assumed to be the most important, were kept. These terms were assigned a number of types, dividing them into groups... [to full text]
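To make the Topic Map constructs named in this abstract concrete (topics, topic names, associations, roles), here is a minimal data-structure sketch; the classes and the machine-translation example instance are illustrative, not the author's implementation.

```python
# Minimal sketch of core Topic Map constructs; illustrative only.
from dataclasses import dataclass, field

@dataclass
class Topic:
    identifier: str
    names: list = field(default_factory=list)   # topic names

@dataclass
class Association:
    assoc_type: str
    roles: dict = field(default_factory=dict)   # role name -> Topic playing it

mt = Topic("machine-translation", names=["machine translation", "mašininis vertimas"])
translation = Topic("translation", names=["translation"])

# An association placing "machine translation" under "translation".
is_a = Association("subclass-of", roles={"subclass": mt, "superclass": translation})
print(is_a.roles["subclass"].names[0], "is a kind of",
      is_a.roles["superclass"].names[0])
```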
415

Data Mining in Social Media for Stock Market Prediction

Xu, Feifei 09 August 2012 (has links)
In this thesis, machine learning algorithms are used in NLP to gauge public sentiment on individual stocks from social media in order to study its relationship with stock price changes. The NLP approach to sentiment detection is a two-stage process, applying neutral vs. polarized sentiment detection before positive vs. negative sentiment detection; SVMs prove to be the best classifiers, with overall accuracy rates of 71.84% and 74.3%, respectively. It is discovered that users' overnight activity on StockTwits correlates significantly and positively with the stock's trading volume on the next business day. Under the Granger causality test, the collective after-hours sentiment has strong predictive power for the next day's stock price change in 9 of the 15 stocks studied; the overall accuracy rate in predicting the up and down movement of stocks from collective sentiment is 58.9%.
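The two-stage classification scheme described here can be sketched with off-the-shelf tools. The following is a minimal illustration using scikit-learn's linear SVM; the inline messages are invented stand-ins for labelled StockTwits data, not the thesis corpus.

```python
# Sketch of a two-stage sentiment pipeline: neutral vs. polarized first,
# then positive vs. negative. Training data here is a tiny invented placeholder.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

stage1 = make_pipeline(TfidfVectorizer(), LinearSVC())  # neutral vs. polarized
stage2 = make_pipeline(TfidfVectorizer(), LinearSVC())  # positive vs. negative

stage1.fit(["earnings call at 4pm today", "love this stock", "dumping all my shares"],
           ["neutral", "polarized", "polarized"])
stage2.fit(["love this stock", "dumping all my shares"],
           ["positive", "negative"])

def classify(message):
    """Route through stage 1; only polarized messages reach stage 2."""
    if stage1.predict([message])[0] == "neutral":
        return "neutral"
    return stage2.predict([message])[0]

print(classify("really love this company"))
```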
416

Answer extraction for simple and complex questions

Joty, Shafiz Rayhan, University of Lethbridge. Faculty of Arts and Science January 2008 (has links)
When a user is served a ranked list of relevant documents by a standard document search engine, his search task is usually not over. He has to go through the entire document contents to find the precise piece of information he was looking for. Question answering, the retrieval of answers to natural language questions from a document collection, tries to remove this onus on the end-user by providing direct access to relevant information. This thesis is concerned with open-domain question answering. We have considered both simple and complex questions. Simple questions (i.e. factoid and list) are easier to answer than questions with complex information needs, which require inferencing and synthesizing information from multiple documents. Our question answering system for simple questions is based on question classification and document tagging. Question classification extracts useful information (i.e. answer type) about how to answer the question, and document tagging extracts useful information from the documents, which is used in finding the answer to the question. For complex questions, we experimented with both empirical and machine learning approaches. We extracted several features of different types (i.e. lexical, lexical-semantic, syntactic and semantic) for each of the sentences in the document collection in order to measure its relevancy to the user query. A hill-climbing local search strategy is used to fine-tune the feature weights. We also experimented with two unsupervised machine learning techniques, the k-means and Expectation Maximization (EM) algorithms, and evaluated their performance. For all these methods, we have shown the effects of different kinds of features. / xi, 214 leaves : ill. (some col.) ; 29 cm. --
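The hill-climbing weight-tuning step mentioned above might look like the following schematic sketch; the scoring function is a placeholder for whatever relevance measure is computed against held-out data, and none of this reproduces the thesis's actual code.

```python
# Schematic hill climbing over feature weights: perturb one weight at a
# time and keep the change only if the relevance score improves.
import random

def hill_climb(weights, score_fn, step=0.05, iters=200):
    best = score_fn(weights)
    for _ in range(iters):
        i = random.randrange(len(weights))
        candidate = list(weights)
        candidate[i] += random.choice((-step, step))
        s = score_fn(candidate)
        if s > best:                      # keep only improving moves
            weights, best = candidate, s
    return weights, best

# score_fn(weights) would score the sentence rankings produced under the
# given weighting of lexical/syntactic/semantic features, e.g. against
# reference answers on held-out queries.
```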
417

The Effect of Natural Language Processing in Bioinspired Design

Burns, Madison Suzann 1987- 14 March 2013 (has links)
Bioinspired design methods are a new and evolving collection of techniques used to extract biological principles from nature to solve engineering problems. The application of bioinspired design methods is typically confined to existing problems encountered in new product design or redesign. A primary goal of this research is to utilize existing bioinspired design methods to solve a complex engineering problem and thereby examine the versatility of the methods in solving new problems. Here, current bioinspired design methods are applied to seek a biologically inspired solution to geoengineering. Bioinspired solutions developed in the case study include droplet density shields, phosphorescent mineral injection, and reflective orbiting satellites. The success of the methods in the case study indicates that bioinspired design methods have the potential to solve new problems and provide a platform of innovation for old problems. A secondary goal of this research is to help engineers use bioinspired design methods more efficiently by reducing post-processing time and eliminating the need for extensive knowledge of biological terminology through natural language processing techniques. Using the complex problem of geoengineering, a hypothesis is developed that asserts the usefulness of nouns in creating higher-quality solutions. A distinction is made between the types of nouns in a sentence, primary and spatial, and the hypothesis is refined to state that primary nouns are the most influential part of speech in providing biological inspiration for high-quality ideas. Through three design experiments, the author determines that engineers are more likely to develop a higher-quality solution using the primary noun in a given passage of biological text. The identification of primary nouns through part-of-speech tagging will provide engineers with an analogous biological system without extensive analysis of the results. The use of noun identification to improve the efficiency of bioinspired design method applications is a new concept and is the primary contribution of this research.
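As a rough sketch of the noun-identification step, the following uses NLTK's part-of-speech tagger to pull nouns out of a passage of biological text; the passage is invented, and the thesis's primary/spatial noun distinction is not reproduced here.

```python
# Extract noun candidates from a biological passage via POS tagging.
# Requires NLTK's tokenizer and POS-tagger models to be downloaded.
import nltk

passage = ("The lotus leaf repels water because microscopic bumps "
           "on its surface trap a thin layer of air.")

tokens = nltk.word_tokenize(passage)
nouns = [word for word, tag in nltk.pos_tag(tokens) if tag.startswith("NN")]
print(nouns)  # candidate sources of biological inspiration
```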
418

Development of a Graphics Ontology for Natural Language Interfaces

Niknam, Mehdi 13 October 2010 (has links)
The overall context of this thesis research is to explore natural language as a medium for interacting with computer software in the graphics domain, e.g. programs like MS Paint or OpenGL. A core element of most natural language understanding systems is an ontology, which represents the concepts and items of the underlying domain of discourse. This thesis presents an ontology for the graphics domain based on several resources, including documentation and textbooks on graphics systems, existing ontologies, and - most importantly - a collection of natural language instructions to create and modify graphic images. The ontology was developed in several phases, and finally tested as part of a complex natural language interface. This natural language interface accepts verbal instructions in the graphics domain as input and creates matching graphic images as output. The results of our tests indicate a system accuracy of around 80%.
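A toy sketch of how a graphics-domain ontology might sit behind a natural language interface is given below; every concept name and attribute here is illustrative and not drawn from the thesis's ontology.

```python
# Toy graphics ontology: concepts with is-a links and attributes, plus a
# naive lookup from an instruction's words to matching concepts.
ontology = {
    "circle":    {"is_a": "shape", "attributes": ["radius", "color", "position"]},
    "rectangle": {"is_a": "shape", "attributes": ["width", "height", "color", "position"]},
    "shape":     {"is_a": "graphic_object", "attributes": ["color", "position"]},
}

def concepts_in(instruction):
    """Return the ontology concepts mentioned in a verbal instruction."""
    return [word for word in instruction.lower().split() if word in ontology]

print(concepts_in("draw a red circle next to the rectangle"))
# -> ['circle', 'rectangle']
```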
419

A functional theory of creative reading : process, knowledge, and evaluation

Moorman, Kenneth Matthew 08 1900 (has links)
No description available.
420

Efficient prediction of relational structure and its application to natural language processing

Riedel, Sebastian January 2009 (has links)
Many tasks in Natural Language Processing (NLP) require us to predict a relational structure over entities. For example, in Semantic Role Labelling we try to predict the 'semantic role' relation between a predicate verb and its argument constituents. Often NLP tasks not only involve related entities but also relations that are stochastically correlated. For instance, in Semantic Role Labelling the roles of different constituents are correlated: we cannot assign the agent role to one constituent if we have already assigned this role to another. Statistical Relational Learning (also known as First Order Probabilistic Logic) allows us to capture the aforementioned nature of NLP tasks because it is based on the notions of entities, relations and stochastic correlations between relations. It is therefore often straightforward to formulate an NLP task using a First Order probabilistic language such as Markov Logic. However, the generality of this approach comes at a price: the process of finding the relational structure with highest probability, also known as maximum a posteriori (MAP) inference, is often inefficient, if not intractable. In this work we seek to improve the efficiency of MAP inference for Statistical Relational Learning. We propose a meta-algorithm, namely Cutting Plane Inference (CPI), that iteratively solves small subproblems of the original problem using any existing MAP technique and inspects parts of the problem that are not yet included in the current subproblem but could potentially lead to an improved solution. Our hypothesis is that this algorithm can dramatically improve the efficiency of existing methods while remaining at least as accurate. We frame the algorithm in Markov Logic, a language that combines First Order Logic and Markov Networks. Our hypothesis is evaluated using two tasks: Semantic Role Labelling and Entity Resolution. It is shown that the proposed algorithm improves the efficiency of two existing methods by two orders of magnitude and leads an approximate method to more probable solutions. We also show that CPI, at convergence, is guaranteed to be at least as accurate as the method used within its inner loop. Another core contribution of this work is a theoretical and empirical analysis of the boundary conditions of Cutting Plane Inference. We describe cases when Cutting Plane Inference will definitely be difficult (because it instantiates large networks or needs many iterations) and when it will be easy (because it instantiates small networks and needs only few iterations).
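The Cutting Plane Inference loop described above can be outlined schematically as follows; `solve` and `violated` are placeholders for a concrete base MAP method and a routine that finds ground formulas the current solution violates, and this sketch is not Riedel's implementation.

```python
# Schematic outline of Cutting Plane Inference (CPI): repeatedly solve a
# small subproblem and grow it with parts of the full problem that the
# current solution violates, until nothing new is violated.
def cutting_plane_inference(full_problem, solve, violated):
    subproblem = set()                                # start from an empty ground network
    solution = solve(subproblem)
    while True:
        new_parts = violated(full_problem, solution)  # inspect the rest of the problem
        if not new_parts:                             # converged: no violated parts left
            return solution
        subproblem |= new_parts                       # add them to the subproblem
        solution = solve(subproblem)                  # re-run the base MAP method
```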
