401

Word meaning in context as a paraphrase distribution : evidence, learning, and inference

Moon, Taesun, Ph.D. 25 October 2011
In this dissertation, we introduce a graph-based model of instance-based usage meaning that is cast as a problem of probabilistic inference. The main aim of this model is to provide a flexible platform that can be used to explore multiple hypotheses about usage meaning computation. Our model takes up and extends the proposals of Erk and Pado [2007] and McCarthy and Navigli [2009] by representing usage meaning as a probability distribution over potential paraphrases. We use undirected graphical models to infer this probability distribution for every content word in a given sentence. Graphical models represent complex probability distributions through a graph in which nodes stand for random variables and edges stand for direct probabilistic interactions between them; the absence of an edge between two variables reflects an independence assumption. In our model, we represent each content word of the sentence through two adjacent nodes: the observed node represents the surface form of the word itself, and the hidden node represents its usage meaning. The distribution over values that we infer for the hidden node is a paraphrase distribution for the observed word. To encode the fact that lexical semantic information is exchanged between syntactic neighbors, the graph contains edges that mirror the dependency graph for the sentence. Further knowledge sources that influence the hidden nodes are represented through additional edges that, for example, connect to the document topic. Adjacent knowledge sources are integrated in the standard way, by multiplying factors and marginalizing over variables. Evaluating on a paraphrasing task, we find that our model outperforms the current state-of-the-art usage vector model [Thater et al., 2010] on all parts of speech except verbs, where the previous model wins by a small margin. But our main focus is not on the numbers but on the fact that our model is flexible enough to encode different hypotheses about usage meaning computation. In particular, we concentrate on five questions (with minor variants):
- Nonlocal syntactic context: Existing usage vector models use only a word's direct syntactic neighbors for disambiguation or for inferring some other meaning representation. Would it help to instead let contextual information "flow" along the entire dependency graph, with each word's inferred meaning relying on the paraphrase distributions of its neighbors?
- Influence of collocational information: In some cases it is intuitively plausible to use a neighboring word's selectional preference toward the target to determine the target's meaning in context. How does building selectional preferences into the model affect performance?
- Non-syntactic bag-of-words context: To what extent can non-syntactic information, in the form of bag-of-words context, help in inferring meaning?
- Effects of parametrization: We experiment with two transformations of the maximum likelihood estimate: one interpolates several MLEs, the other transforms the MLE by exponentiating pointwise mutual information. Which performs better?
- Type of hidden nodes: Our model posits a tier of hidden nodes immediately adjacent to the surface tier of observed words to capture dynamic usage meaning. We examine two variants of the hidden nodes: in one, the nodes take actual words as values; in the other, they take nameless indexes as values. The former has the benefit of interpretability, while the latter allows more standard parameter estimation.
Portions of this dissertation are derived from joint work between the author and Katrin Erk [submitted].
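As an editorial illustration of the inference step described above, the sketch below multiplies unary and pairwise factors over a single dependency edge and marginalizes to obtain a paraphrase distribution for each hidden node. All words, candidate paraphrases, and factor values are invented; the dissertation's actual graphs, factors, and inference procedure are considerably richer.

```python
# Toy exact marginal inference for paraphrase distributions over one
# dependency edge. Every word, candidate set, and factor value below is
# invented for illustration.
from itertools import product

# Hidden-node value sets: candidate paraphrases for two observed words
# linked by a dependency edge (e.g., "acquire" -(dobj)-> "knowledge").
cands_v = ["buy", "gain", "learn"]     # paraphrases of the verb
cands_n = ["wisdom", "information"]    # paraphrases of the noun

# Unary factors: compatibility of each paraphrase with its observed word.
unary_v = {"buy": 0.2, "gain": 0.5, "learn": 0.3}
unary_n = {"wisdom": 0.4, "information": 0.6}

# Pairwise factor along the dependency edge: how well two paraphrases fit
# together (a stand-in for selectional-preference information).
pairwise = {
    ("buy", "wisdom"): 0.1, ("buy", "information"): 0.3,
    ("gain", "wisdom"): 0.6, ("gain", "information"): 0.8,
    ("learn", "wisdom"): 0.7, ("learn", "information"): 0.9,
}

# Joint = product of factors; marginals = sum out the other variable.
joint = {(v, n): unary_v[v] * unary_n[n] * pairwise[(v, n)]
         for v, n in product(cands_v, cands_n)}
z = sum(joint.values())

marginal_v = {v: sum(joint[(v, n)] for n in cands_n) / z for v in cands_v}
marginal_n = {n: sum(joint[(v, n)] for v in cands_v) / z for n in cands_n}

print("paraphrase distribution for the verb:", marginal_v)
print("paraphrase distribution for the noun:", marginal_n)
```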
402

Σχεδιασμός, κατασκευή και αξιολόγηση ελληνικού γραμματικού διορθωτή / Design, construction and evaluation of a Greek grammar checker

Γάκης, Παναγιώτης 07 May 2015
Στόχος της παρούσας διδακτορικής διατριβής είναι ο σχεδιασμός και η υλοποίηση ενός ηλεκτρονικού εύχρηστου εργαλείου (γραμματικού διορθωτή) που θα προβαίνει στη μορφολογική και συντακτική ανάλυση φράσεων, προτάσεων και λέξεων με σκοπό τη διόρθωση γραμματικών και υφολογικών λαθών. Βάση για την αντιμετώπιση όλων αυτών των ζητημάτων συνιστούν οι ρυθμίσεις της Γραμματικής (αναπροσαρμογή της Μικρής Νεοελληνικής Γραμματικής του Μανόλη Τριανταφυλλίδη), η οποία αποτελεί την επίσημη, από το 1976, γραμματική κωδικοποίηση της νεοελληνικής γλώσσας. (Κατά την εκπόνηση της διατριβής δεν έχουν ληφθεί υπόψη οι -ελάχιστες- διαφορές της νέας σχολικής γραμματικής Ε΄ και Στ΄ Δημοτικού). Με δεδομένη την απουσία ενός τέτοιου εργαλείου για τα ελληνικά, η ανάπτυξη του προϊόντος θα βασίζεται καταρχήν στη λεπτομερή καταγραφή, ανάλυση και τυποποίηση των λαθών του γραπτού λόγου και στη συνέχεια στην επιλογή του λογισμικού εκείνου που θα περιγράφει φορμαλιστικά τα γραμματικά λάθη. Η διατριβή παρουσιάζει στατιστικά στοιχεία που αφορούν τη σχέση των λαθών με το φύλο ή με το κειμενικό είδος των κειμένων στα οποία και συναντούνται όπως επίσης και την αναγνώρισή τους από μαθητές. Στην παρούσα έρευνα παρουσιάζεται ο φορμαλισμός υλοποίησης που χρησιμοποιήθηκε (Mnemosyne) και παρουσιάζονται οι ιδιαιτερότητες της ελληνικής γλώσσας που δυσχεραίνουν την υπολογιστική επεξεργασία της. Ο φορμαλισμός αυτός έχει ήδη χρησιμοποιηθεί για αναγνώριση πολυλεκτικών όρων καθώς και για την υλοποίηση ηλεκτρονικών εργαλείων (γραμματικών) με στόχο την αυτόματη εξαγωγή πληροφορίας. Με αυτό τον τρόπο όλοι οι χρήστες της γλώσσας (και όχι μόνο αυτοί που έχουν την ελληνική ως μητρική γλώσσα) μπορούν να κατανοήσουν καλύτερα όχι μόνον τη λειτουργία των διαφόρων μερών του συστήματος της γλώσσας αλλά και τον τρόπο με τον οποίο λειτουργούν οι μηχανισμοί λειτουργίας του γλωσσικού συστήματος κατά τη γλωσσική ανάλυση. Οι βασικές περιοχές γραμματικών λαθών όπου θα παρεμβαίνει ο γραμματικός διορθωτής θα είναι: 1) θέματα τονισμού και στίξης, 2) τελικό -ν, 3) υφολογικά ζητήματα (ρηματικοί τύποι σε περιπτώσεις διπλοτυπίας, κλιτικοί τύποι), 4) ζητήματα καθιερωμένης γραφής λέξεων ή φράσεων της νέας ελληνικής γλώσσας (στερεότυπες φράσεις, λόγιοι τύποι), 5) ζητήματα κλίσης (λανθασμένοι κλιτικοί τύποι ονομάτων ή ρημάτων είτε λόγω άγνοιας είτε λόγω σύγχυσης), 6) ζητήματα λεξιλογίου (περιπτώσεις εννοιολογικής σύγχυσης, ελληνικές αποδόσεις ξένων λέξεων, πλεονασμός, χρήση εσφαλμένης φράσης ή λέξης), 7) ζητήματα ορθογραφικής σύγχυσης (ομόηχες λέξεις), 8) ζητήματα συμφωνίας (θέματα ασυμφωνίας στοιχείων της ονοματικής ή της ρηματικής φράσης), 9) ζητήματα σύνταξης (σύνταξη ρημάτων) και 10) περιπτώσεις λαθών που απαιτούν πιο εξειδικευμένη διαχείριση ορθογραφικής διόρθωσης. Βάση για την υλοποίηση του λεξικού αποτελεί το ηλεκτρονικό μορφολογικό λεξικό Neurolingo Lexicon, ένα λεξικό χτισμένο σε ένα μοντέλο 5 επιπέδων με τουλάχιστον 90.000 λήμματα που παράγουν 1.200.000 κλιτικούς τύπους. Οι τύποι αυτοί φέρουν πληροφορία: α) ορθογραφική (ορθή γραφή του κλιτικού τύπου), β) μορφηματική (το είδος των μορφημάτων: πρόθημα, θέμα, επίθημα, κατάληξη, που απαρτίζουν τον κλιτικό τύπο), γ) μορφοσυντακτική (μέρος του λόγου, γένος, πτώση, πρόσωπο κτλ.), δ) υφολογική (τα υφολογικά χαρακτηριστικά του τύπου: προφορικό, λόγιο κτλ.) και ε) ορολογική (επιπλέον πληροφορία για το αν ο τύπος αποτελεί μέρος ειδικού λεξιλογίου). Το λεξικό αυτό αποτελεί και τον θεμέλιο λίθο για την υποστήριξη του γραμματικού διορθωτή (Grammar Checker).
Η αξία και ο ρόλος του μορφολογικού λεξικού για την υποστήριξη ενός γραμματικού διορθωτή είναι αυτονόητη, καθώς η μορφολογία είναι το πρώτο επίπεδο γλώσσας που εξετάζεται και το συντακτικό επίπεδο βασίζεται και εξαρτάται από τη μορφολογία των λέξεων. Μείζον πρόβλημα αποτέλεσε η λεξική ασάφεια, προϊόν της πλούσιας μορφολογίας της ελληνικής γλώσσας. Με δεδομένο αυτό το πρόβλημα σχεδιάστηκε ο σχολιαστής (tagger) με αμιγώς γλωσσολογικά κριτήρια για τις περιπτώσεις εκείνες όπου η λεξική ασάφεια αποτελούσε εμπόδιο στην αποτύπωση λαθών στη χρήση της ελληνικής γλώσσας. Στον γραμματικό διορθωτή δόθηκαν προς διόρθωση κείμενα που είχαν διορθωθεί από άνθρωπο. Σε ένα πολύ μεγάλο ποσοστό ο διορθωτής προσεγγίζει τη διόρθωση του ανθρώπου με μόνη διαφοροποίηση εκείνα τα λάθη που αφορούν τη συνοχή του κειμένου και κατ’ επέκταση όλα τα νοηματικά λάθη. / The aim of this thesis is to design and implement a user-friendly electronic tool (a grammar checker) that carries out morphological and syntactic analysis of phrases, sentences and words in order to correct grammatical and stylistic errors. Our foundation for dealing with all these issues is the Grammar (an adaptation of the Little Modern Greek Grammar of Manolis Triantafyllidis), which has been the official codified grammar of Modern Greek since 1976. (The thesis does not take into account the minimal differences that appear in the new Greek grammar book of the fifth and sixth grades of elementary school.) Given the absence of such a tool for Greek, the development of the product is based first on the detailed recording, analysis and formalization of the errors of written language, and then on the selection of software that can describe the grammatical errors formalistically. The thesis presents statistics relating the errors to the writer's gender and to the text type in which they appear, as well as to their recognition by students. This research presents the implementation formalism used (Mnemosyne) and the particularities of the Greek language that hinder its computational processing. The formalism has already been used to recognize multi-word terms and to implement electronic tools (grammars) aimed at automatic information extraction. In this way, all users of the language (not only native speakers of Greek) can better understand both the function of the various parts of the language system and the way its mechanisms operate during linguistic analysis. The main areas of grammatical error the checker addresses are: 1) accentuation and punctuation, 2) final -ν, 3) stylistic issues (verb forms with parallel variants, inflected forms), 4) established spellings of Modern Greek words and phrases (fixed expressions, learned forms), 5) inflection issues (incorrect inflected forms of nouns or verbs, through ignorance or confusion), 6) vocabulary issues (conceptual confusion, Greek renderings of foreign words, redundancy, use of a wrong word or phrase), 7) orthographic confusion (homophonous words), 8) agreement issues (disagreement among elements of the noun or verb phrase), 9) syntax issues (verb complementation), and 10) errors that require more specialized spelling-correction handling.
The basis for the implementation is the electronic morphological lexicon Neurolingo Lexicon, a lexicon built on a five-level model with at least 90,000 lemmas that generate about 1,200,000 inflected forms. These forms carry information that is: a) orthographic (the correct spelling of the inflected form), b) morphemic (the kinds of morphemes that make up the form: prefix, stem, suffix, ending), c) morphosyntactic (part of speech, gender, case, person, etc.), d) stylistic (the stylistic characteristics of the form: colloquial, learned, etc.) and e) terminological (additional information on whether the form is part of a specialized vocabulary). This lexicon is the cornerstone that supports the grammar checker. The value and role of the morphological lexicon in supporting a grammar checker are self-evident, since morphology is the first level at which the language is examined, and the syntactic level rests on and depends on the morphology of words. A major problem in processing the language was lexical ambiguity, a product of the rich morphology of Greek. Given this problem, the tagger was designed on purely linguistic criteria for those cases where lexical ambiguity stood in the way of capturing errors in the use of Greek. Texts that had been corrected by a human were given to the grammar checker for correction. In a very large percentage of cases the checker approaches the human correction, differing only on errors that concern the coherence of the text and, by extension, errors of meaning.
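Of the ten error areas listed above, the final -ν rule lends itself to a toy rule-based sketch. The simplified school-grammar rule, word lists, and function names below are illustrative only; the thesis implements such checks in the Mnemosyne formalism over the full morphological lexicon, not with ad-hoc pattern matching.

```python
# Toy check for the Greek final -n (teliko -n) rule. Simplified version:
# the -n of "tin/ton" is kept when the next word starts with a vowel or a
# stop/affricate onset and dropped otherwise. Illustrative sketch only.
import re

VOWELS = set("αεηιουωάέήίόύώϊϋΐΰ")
STOP_ONSETS = ("κ", "π", "τ", "ξ", "ψ", "γκ", "μπ", "ντ", "τσ", "τζ")

def keeps_final_n(next_word: str) -> bool:
    w = next_word.lower()
    return bool(w) and (w[0] in VOWELS or w.startswith(STOP_ONSETS))

def check_final_n(text: str) -> list[str]:
    """Flag την/τη (fem. article/pronoun) used against the rule."""
    warnings = []
    tokens = re.findall(r"\w+", text)
    for cur, nxt in zip(tokens, tokens[1:]):
        if cur.lower() == "την" and not keeps_final_n(nxt):
            warnings.append(f"'{cur} {nxt}': consider dropping the final -ν")
        elif cur.lower() == "τη" and keeps_final_n(nxt):
            warnings.append(f"'{cur} {nxt}': consider adding the final -ν")
    return warnings

print(check_final_n("Είδα τη πόλη και την θάλασσα."))
# ["'τη πόλη': consider adding the final -ν",
#  "'την θάλασσα': consider dropping the final -ν"]
```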
403

Automated Discovery and Analysis of Social Networks from Threaded Discussions

Gruzd, Anatoliy A.; Haythornthwaite, Caroline January 2008
To gain greater insight into the operation of online social networks, we applied Natural Language Processing (NLP) techniques to text-based communication to identify and describe underlying social structures in online communities. This paper presents our approach and a preliminary evaluation for content-based, automated discovery of social networks. Our research question is: what syntactic and semantic features of postings in a threaded discussion help uncover explicit and implicit ties between network members, and which provide a reliable estimate of the strengths of interpersonal ties among them? To evaluate our automated procedures, we compare the results from the NLP processes with social networks built from basic who-to-whom data, and with a sample of hand-coded data derived from a close reading of the text. For our test case, and as part of ongoing research on networked learning, we used the archive of threaded discussions collected over eight iterations of an online graduate class. We first associate personal names and nicknames mentioned in the postings with class participants. Next we analyze the context in which each name occurs in the postings to determine whether or not there is an interpersonal tie between the sender of a posting and a person mentioned in it. Because information exchange is a key factor in the operation and success of a learning community, we estimate and assign weights to the ties by measuring the amount of information exchanged between each pair of nodes; information in this case is operationalized as counts of important concept terms in the postings, as derived through the NLP analyses. Finally, we compare the resulting network(s) against those derived by other means, including basic who-to-whom data derived from posting sequences (e.g., whose postings follow whose). In this comparison we evaluate what is gained in understanding network processes by our more elaborate analyses.
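A minimal sketch of the tie-extraction and weighting idea described above, with invented postings, participants, and concept terms; the paper's actual pipeline for name resolution and concept-term extraction is far more involved.

```python
# Sketch: a directed tie from a poster to each participant mentioned in
# the posting, weighted by counts of "concept terms" as a stand-in for
# information exchange. All postings and name lists are invented.
from collections import Counter

participants = {"alice", "bob", "carol"}
concept_terms = {"network", "learning", "community"}  # from NLP analysis

postings = [
    {"sender": "alice", "text": "Bob, your point about network learning is key."},
    {"sender": "carol", "text": "I agree with Alice: community matters."},
]

ties = Counter()  # (sender, mentioned) -> information-exchange weight
for post in postings:
    tokens = [t.strip(".,:;").lower() for t in post["text"].split()]
    mentioned = participants.intersection(tokens) - {post["sender"]}
    weight = sum(1 for t in tokens if t in concept_terms)
    for person in mentioned:
        ties[(post["sender"], person)] += weight

print(dict(ties))  # {('alice', 'bob'): 2, ('carol', 'alice'): 1}
```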
404

Improving the secondary utilization of clinical data by incorporating context

D'Avolio, Leonard W., Rees, Galya, Boyadzhyan, Lousine January 2006
This is a submission to the "Interrogating the social realities of information and communications systems" pre-conference workshop, ASIST AM 2006. There is great potential in the utilization of existing clinical data to assist in decision support, epidemiology, and information retrieval. As we transition from evaluating systems' abilities to accurately capture the information in the record to the clinical application of results, we must incorporate the contextual influences that affect such efforts. A methodology is proposed to assist researchers in identifying strengths and weaknesses of clinical data for application to secondary purposes. The results of its application to three ongoing clinical research projects are discussed.
405

Identifying Latent Attributes from Video Scenes Using Knowledge Acquired From Large Collections of Text Documents

Tran, Anh Xuan January 2014
Peter Drucker, a well-known and influential writer and philosopher in the field of management theory and practice, once claimed that “the most important thing in communication is hearing what isn't said.” It is not difficult to see that a similar concept also holds in the context of video scene understanding. In almost every non-trivial video scene, the most important elements, such as the motives and intentions of the actors, can never be seen or directly observed, yet the identification of these latent attributes is crucial to our full understanding of the scene. That is to say, latent attributes matter. In this work, we explore the task of identifying latent attributes in video scenes, focusing on the mental states of participant actors. We propose a novel approach to the problem based on the use of large text collections as background knowledge and minimal information about the videos, such as activity and actor types, as query context. We formalize the task and a measure of merit that accounts for the semantic relatedness of mental state terms, as well as their distribution weights. We develop and test several largely unsupervised information extraction models that identify the mental state labels of human participants in video scenes given some contextual information about the scenes. We show that these models produce complementary information and that their combination significantly outperforms the individual models and improves performance over several baseline methods on two different datasets. We present an extensive analysis of our models and close with a discussion of our findings, along with a roadmap for future research.
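The corpus-scoring idea can be sketched with a PMI-style association measure: rank candidate mental-state labels by how strongly they co-occur with the scene's query context in a background text collection. The mini-corpus, context terms, and label set below are invented; the dissertation's models and measure of merit are more elaborate.

```python
# Rank candidate mental-state labels by pointwise mutual information with
# the scene context over a tiny invented corpus.
import math
from collections import Counter

corpus = [
    "the angry driver honked during the traffic jam",
    "the nervous student waited for the exam results",
    "the driver was impatient in heavy traffic",
]
context = {"driver", "traffic"}                 # scene query context
labels = ["angry", "nervous", "impatient"]      # candidate mental states

doc_freq = Counter()
for doc in corpus:
    words = set(doc.split())
    for label in labels:
        if label in words:
            doc_freq[label] += 1              # label occurs in document
            if words & context:
                doc_freq[(label, "ctx")] += 1  # ...alongside the context

n = len(corpus)

def pmi(label: str) -> float:
    joint = doc_freq[(label, "ctx")] / n
    if joint == 0:
        return float("-inf")
    p_label = doc_freq[label] / n
    p_ctx = sum(1 for d in corpus if set(d.split()) & context) / n
    return math.log(joint / (p_label * p_ctx))

print(sorted(labels, key=pmi, reverse=True))
# ['angry', 'impatient', 'nervous'] -- context-matched states rank first
```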
406

Outomatiese Afrikaanse woordsoortetikettering (Automatic Afrikaans part-of-speech tagging) / by Suléne Pilon

Pilon, Suléne January 2005
Any community that wants to be part of technological progress has to ensure that the language(s) of that community has/have the necessary human language technology resources. Part of these resources are so-called "core technologies", including part-of-speech taggers. The first part-of-speech tagger for Afrikaans is developed in this research project. It is indicated that three resources (a tag set, a tagging algorithm and annotated training data) are necessary for the development of such a part-of-speech tagger. Since none of these resources exist for Afrikaans, three objectives are formulated for this project, i.e. (a) to develop a linguistically accurate tag set for Afrikaans; (b) to determine which algorithm is the most effective one to use; and (c) to find an effective method for generating annotated Afrikaans training data. To reach the first objective, a unique and language-specific tag set was developed for Afrikaans. The resulting tag set is relatively big and consists of 139 tags; its level of specificity can easily be adjusted to make it smaller and less specific. After the development of the tag set, research is done on different approaches to, and techniques that can be used in, the development of a part-of-speech tagger. The available algorithms are evaluated against the prerequisites that were set, and in doing so the most effective algorithm for the purposes of this project, TnT, is identified. Bootstrapping is then used to generate training data with the help of the TnT algorithm. This process results in 20,000 correctly annotated words, thus producing the annotated training data that is necessary for the development of a part-of-speech tagger. The tagger trained on these 20,000 words reaches an accuracy of 85.87% when evaluated. The tag set is then simplified to thirteen tags in order to determine the effect that the size of the tag set has on the accuracy of the tagger; the tagger is 93.69% accurate when using the reduced tag set. The main conclusion of this study is that training data of 20,000 words is not enough for the Afrikaans TnT tagger to compete with other state-of-the-art taggers. The tagger and the data developed in this project can be used to generate even more training data in order to develop an optimally accurate Afrikaans TnT tagger. Different techniques might also lead to better results; therefore other algorithms should be tested. / Thesis (M.A.)--North-West University, Potchefstroom Campus, 2005.
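A sketch of the bootstrapping procedure described above, using NLTK's TnT implementation (assumed available; the thesis used the original TnT). The seed sentences and tag labels are invented placeholders, not the 139-tag Afrikaans tag set, and the human-correction step is reduced to a comment.

```python
# Bootstrapping sketch: train a tagger on a small seed, tag raw text,
# (manually) correct it, fold it back in, and retrain on the larger set.
from nltk.tag import tnt

# Tiny invented seed of hand-annotated Afrikaans sentences (word, tag);
# the tags here are generic placeholders, not the thesis's tag set.
seed = [
    [("die", "DET"), ("hond", "NOUN"), ("blaf", "VERB")],
    [("die", "DET"), ("kat", "NOUN"), ("slaap", "VERB")],
]
raw_sentences = [["die", "voël", "sing"]]  # untagged text to bootstrap on

tagger = tnt.TnT()
tagger.train(seed)

for sentence in raw_sentences:
    tagged = tagger.tag(sentence)
    # In the real procedure a human corrects the output here before it is
    # added to the training data; this sketch accepts it as-is.
    corrected = tagged
    seed.append(corrected)

tagger = tnt.TnT()  # retrain on the enlarged data set
tagger.train(seed)
print(tagger.tag(["die", "hond", "slaap"]))
```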
407

Natūralios kalbos apdorojimo terminų ontologija: kūrimo problemos ir jų sprendimo būdai / Ontology of natural language processing terms: development issues and their solutions

Ramonas, Vilmantas 17 June 2010
Šiame darbe aptariamas natūralios kalbos apdorojimo terminų ontologijos kūrimas, kūrimo problemos ir jų sprendimo būdai. Tam, iš skirtingų šaltinių surinkta 217 NLP terminų. Terminai išversti į lietuvių kalbą. Trumpai aptartos problemos verčiant. Aprašytos tiek kompiuterinės, tiek filosofinės ontologijos, paminėti jų panašumai ir skirtumai. Išsamiau aptartas filosofinis požiūris į sąvokų ir daiktų panašumą, ką reikia žinoti, siekiant kiek galima geriau suprasti kompiuterinių ontologijų sudarymo principus. Išnagrinėtas pats NLP terminas, kas sudaro NLP, kokios natūralios kalbos apdorojimo technologijos jau sukurtos, kokios dar kuriamos. NLP terminų ontologijos sudarymui pasirinkus Teminių žemėlapių ontologijos struktūrą ir principus, plačiai aprašyti Teminių žemėlapių (TM) sudarymo principai, pagrindinės TM sudedamosios dalys: temos, temų vardai, asociacijos, vaidmenys asociacijose ir kiti. Vėliau, iš turimų terminų, paliekant tokią struktūrą, kokia rasta šaltinyje, nubraižytas medis. Prieita išvados, jog terminų skaičių reikia mažinti ir atsisakyti pirminės iš šaltinių atsineštos struktūros. Tad palikti tik 69 terminai, darant prielaidą, jog šie svarbiausi. Šiems terminams priskirta keliolika tipų, taip juos suskirstant į grupes. Ieškant dar geresnio skirstymo būdo, kiekvienam iš terminų priskirtas vienas ar keli jį geriausiai nusakantys meta aprašymai, pvz.: mašininis vertimas – vertimas, aukštas automatizavimo lygis. Visi meta aprašymai suskirstyti į 7 stambiausias grupes... [toliau žr. visą tekstą] / This work discusses the development of an ontology of natural language processing terms, the problems encountered, and their solutions. A collection of 217 NLP terms was gathered from different sources. The terms were translated into Lithuanian, and translation problems are discussed briefly. Both computational and philosophical ontologies are described, along with their similarities and differences. The philosophical approach to the similarity of concepts and objects is discussed in detail, as background for understanding the principles of computational ontologies as well as possible. The term NLP itself is examined: what NLP comprises, which natural language processing technologies have already been developed, and which are still under development. The structure and principles of Topic Maps (TM) were chosen for the ontology of NLP terms, and the principles of TM composition are described at length, together with the main TM components: topics, topic names, associations, roles in associations and others. A tree was then drawn from the collected terms, preserving the structure found in the sources. It was found that the number of terms should be reduced and the original source structure abandoned; only 69 terms were kept, on the assumption that these are the most important. These terms were assigned several types, dividing them into groups... [to full text]
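A minimal sketch of the Topic Map building blocks named above (topics, topic names, associations, roles), applied to one term from the work; the dataclass fields are illustrative, not a standard Topic Maps API.

```python
# Toy Topic Map structures: topics with multilingual names, and a typed
# association whose roles point at topics.
from dataclasses import dataclass, field

@dataclass
class Topic:
    identifier: str
    names: dict = field(default_factory=dict)  # language -> topic name

@dataclass
class Association:
    association_type: str
    roles: dict = field(default_factory=dict)  # role name -> Topic

mt = Topic("machine-translation",
           names={"en": "machine translation", "lt": "mašininis vertimas"})
nlp = Topic("nlp", names={"en": "natural language processing"})

# A part-whole association: machine translation plays the "part" role,
# NLP plays the "whole" role.
is_part_of = Association(
    association_type="part-whole",
    roles={"part": mt, "whole": nlp},
)

print(f"{is_part_of.roles['part'].names['en']} is part of "
      f"{is_part_of.roles['whole'].names['en']}")
```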
408

Data Mining in Social Media for Stock Market Prediction

Xu, Feifei 09 August 2012
In this thesis, machine learning algorithms are applied in an NLP setting to extract public sentiment on individual stocks from social media, in order to study its relationship with stock price changes. Sentiment detection is a two-stage process: Neutral vs. Polarized detection followed by Positive vs. Negative detection. SVMs prove to be the best classifiers, with overall accuracy rates of 71.84% and 74.3%, respectively. We find that users' overnight activity on StockTwits correlates significantly and positively with the stock's trading volume on the next business day. By the Granger causality test, collective after-hours sentiment has significant predictive power over the next day's stock price change for 9 of the 15 stocks studied, and the overall accuracy rate of predicting the up or down movement of stocks from collective sentiment is 58.9%.
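A sketch of the two-stage classification design described above, using scikit-learn SVMs over TF-IDF features; the toy messages and labels are invented, and the thesis's StockTwits features and training data are not reproduced here.

```python
# Two-stage sentiment pipeline: SVM 1 separates neutral from polarized
# messages; SVM 2 splits polarized messages into positive/negative.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy training data standing in for labeled StockTwits messages.
stage1_texts = ["$AAPL to the moon", "earnings call at 5pm",
                "dumping all my shares", "volume report released"]
stage1_labels = ["polarized", "neutral", "polarized", "neutral"]

stage2_texts = ["$AAPL to the moon", "dumping all my shares"]
stage2_labels = ["positive", "negative"]

neutral_vs_polar = make_pipeline(TfidfVectorizer(), LinearSVC())
neutral_vs_polar.fit(stage1_texts, stage1_labels)

pos_vs_neg = make_pipeline(TfidfVectorizer(), LinearSVC())
pos_vs_neg.fit(stage2_texts, stage2_labels)

def classify(message: str) -> str:
    """Stage 1 filters out neutral messages; stage 2 assigns polarity."""
    if neutral_vs_polar.predict([message])[0] == "neutral":
        return "neutral"
    return pos_vs_neg.predict([message])[0]

print(classify("shares to the moon"))
```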
409

Answer extraction for simple and complex questions

Joty, Shafiz Rayhan, University of Lethbridge. Faculty of Arts and Science January 2008
When a user is served a ranked list of relevant documents by a standard document search engine, the search task is usually not over: the user still has to go through the document contents to find the precise piece of information sought. Question answering, the retrieval of answers to natural language questions from a document collection, tries to remove this onus from the end user by providing direct access to relevant information. This thesis is concerned with open-domain question answering, and we consider both simple and complex questions. Simple questions (i.e. factoid and list questions) are easier to answer than questions with complex information needs, which require inferencing and synthesizing information from multiple documents. Our question answering system for simple questions is based on question classification, which extracts useful information about how to answer the question (i.e. the answer type), and document tagging, which extracts useful information from the documents for finding the answer. For complex questions, we experimented with both empirical and machine learning approaches. We extracted several features of different types (i.e. lexical, lexical-semantic, syntactic and semantic) for each sentence in the document collection in order to measure its relevancy to the user query. A hill-climbing local search strategy is used to fine-tune the feature weights. We also experimented with two unsupervised machine learning techniques, the k-means and Expectation-Maximization (EM) algorithms, and evaluated their performance. For all these methods, we show the effects of the different kinds of features. / xi, 214 leaves : ill. (some col.) ; 29 cm. --
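A sketch of the hill-climbing feature-weight tuning mentioned above, with invented feature vectors and an invented evaluation function standing in for the thesis's relevancy measure.

```python
# Hill climbing over feature weights: sentences are scored as a weighted
# sum of feature values; a weight is nudged and kept if the (toy)
# evaluation score does not get worse.
import random

random.seed(0)

# Invented per-sentence feature vectors (lexical, syntactic, semantic...)
# and invented gold relevance judgments used for evaluation.
sentences = [[0.9, 0.1, 0.4], [0.2, 0.8, 0.5], [0.4, 0.3, 0.9]]
gold      = [1, 0, 1]

def score(weights):
    """Toy metric: fraction of gold sentences ranked into the top slots
    by the weighted sum (a stand-in for the real relevancy measure)."""
    ranked = sorted(range(len(sentences)), reverse=True,
                    key=lambda i: sum(w * f
                                      for w, f in zip(weights, sentences[i])))
    top = ranked[:sum(gold)]
    return sum(gold[i] for i in top) / sum(gold)

weights = [1.0, 1.0, 1.0]
best = score(weights)
for _ in range(200):
    j = random.randrange(len(weights))
    candidate = weights[:]
    candidate[j] += random.uniform(-0.2, 0.2)  # local move on one weight
    if score(candidate) >= best:               # accept non-worsening moves
        weights, best = candidate, score(candidate)

print(weights, best)
```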
410

The Effect of Natural Language Processing in Bioinspired Design

Burns, Madison Suzann 1987- 14 March 2013
Bioinspired design methods are a new and evolving collection of techniques used to extract biological principles from nature to solve engineering problems. The application of bioinspired design methods is typically confined to existing problems encountered in new product design or redesign. A primary goal of this research is to apply existing bioinspired design methods to a complex engineering problem, in order to examine the versatility of the methods in solving new problems. Here, current bioinspired design methods are applied to seek a biologically inspired solution to geoengineering. Bioinspired solutions developed in the case study include droplet density shields, phosphorescent mineral injection, and reflective orbiting satellites. The success of the methods in the case study indicates that bioinspired design methods have the potential to solve new problems and provide a platform of innovation for old problems. A secondary goal of this research is to help engineers use bioinspired design methods more efficiently by reducing post-processing time and eliminating the need for extensive knowledge of biological terminology, through the application of natural language processing techniques. Using the complex problem of geoengineering, a hypothesis is developed that asserts the usefulness of nouns in creating higher-quality solutions. A distinction is made between two types of nouns in a sentence, primary and spatial, and the hypothesis is refined to state that primary nouns are the most influential part of speech in providing biological inspiration for high-quality ideas. Through three design experiments, the author determines that engineers are more likely to develop a higher-quality solution using the primary noun in a given passage of biological text. The identification of primary nouns through part-of-speech tagging will provide engineers with an analogous biological system without extensive analysis of the results. The use of noun identification to improve the efficiency of bioinspired design method applications is a new concept and is the primary contribution of this research.
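A sketch of the noun-identification step described above, using NLTK's off-the-shelf tokenizer and POS tagger on an invented passage of biological text; distinguishing primary from spatial nouns would need further, domain-specific rules beyond this.

```python
# POS-tag a passage of biological text and keep the nouns as candidate
# analogy cues. The passage is invented for illustration.
import nltk

# Resource names as in classic NLTK 3.x; newer releases may instead use
# "punkt_tab" and "averaged_perceptron_tagger_eng".
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)

passage = ("The beetle condenses water droplets from fog "
           "on the bumps of its wing surface.")

tokens = nltk.word_tokenize(passage)
tagged = nltk.pos_tag(tokens)
nouns = [word for word, tag in tagged if tag.startswith("NN")]

print(nouns)
# e.g. ['beetle', 'water', 'droplets', 'fog', 'bumps', 'wing', 'surface']
```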
