About

The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations. Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

An assessment of the English-language needs of second-year Thai undergraduate engineering students in a Thai public university in Thailand in relation to the second-year EAP program in engineering

Kittidhaworn, Patama. January 2001 (has links)
Thesis (Ed. D.)--West Virginia University, 2001. / Title from document title page. Document formatted into pages; contains xi, 122 p. : ill. Includes abstract. Includes bibliographical references (p. 92-97).
2

Interpretation of anaphoric expressions in the Lolita system

Urbanowicz, Agnieszka Joanna January 1998 (has links)
This thesis addresses the issue of anaphora resolution in the large-scale natural language system, LOLITA. The work described here involved a thorough analysis of the system's initial performance, the collection of evidence for and the design of the new anaphora resolution algorithm, and the subsequent implementation and evaluation of the system. Anaphoric expressions are elements of a discourse whose resolution depends on other elements of the preceding discourse. The processes involved in anaphora resolution have long been the subject of research in a variety of fields. The changes carried out to LOLITA first involved substantial improvements to the core, lower-level modules which form the basis of the system. A major change specific to the interpretation of anaphoric expressions was then introduced. A system of filters, in which potential candidates for resolution are filtered according to a set of heuristics, has been changed to a system of penalties, where candidates accumulate points throughout the application of the heuristics. At the end of the process, the candidate with the smallest penalty is chosen as the referent. New heuristics, motivated by evidence drawn from research in linguistics, psycholinguistics and AI, have been added to the system. The system was evaluated using a procedure similar to that defined by MUC6 (DARPA 1995). Blind and open tests were used. The first evaluation was carried out after the general improvements to the lower-level modules; the second after the introduction of the new anaphora algorithm. It was found that the general improvements led to a considerable rise in scores on both the blind and the open test sets. As a result of the anaphora-specific improvements, on the other hand, the rise in scores on the open set was larger than the rise on the blind set. In the open set the category of pronouns showed the most marked improvement.
It was concluded that it is the work carried out on the basic, lower-level modules of a large-scale system that leads to the biggest gains. It was also concluded that considerable extra advantage can be gained by using the new weights-based algorithm together with the generally improved system.
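The penalties scheme described above, where each heuristic adds penalty points to a candidate antecedent and the candidate with the smallest total is chosen as the referent, can be sketched roughly as follows. This is an illustrative reconstruction, not LOLITA's actual code; the heuristics, weights, and data shapes are invented for the example.

```python
# Sketch of penalty-based anaphora candidate ranking: violating a heuristic
# adds weighted penalty points, and candidates are ranked lowest-penalty first.
def rank_candidates(pronoun, candidates, heuristics):
    """Return (candidate name, penalty) pairs ordered by accumulated penalty."""
    scores = {}
    for cand in candidates:
        penalty = 0
        for heuristic, weight in heuristics:
            if not heuristic(pronoun, cand):
                penalty += weight  # each violated heuristic adds its weight
        scores[cand["name"]] = penalty
    return sorted(scores.items(), key=lambda kv: kv[1])

# Two illustrative heuristics: gender agreement and recency of mention.
def gender_agrees(pronoun, cand):
    return pronoun["gender"] == cand["gender"]

def is_recent(pronoun, cand):
    return pronoun["sentence"] - cand["sentence"] <= 2

heuristics = [(gender_agrees, 10), (is_recent, 3)]
candidates = [
    {"name": "Mary", "gender": "f", "sentence": 0},
    {"name": "John", "gender": "m", "sentence": 2},
]
pronoun = {"gender": "f", "sentence": 3}
print(rank_candidates(pronoun, candidates, heuristics))
# → [('Mary', 3), ('John', 10)]: Mary violates only recency, John gender.
```

Unlike a hard filter cascade, a distant but otherwise well-matching candidate (Mary here) survives with a small penalty instead of being eliminated outright.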
3

Topic indexing and retrieval for open domain factoid question answering

Ahn, Kisuh January 2009 (has links)
Factoid Question Answering is an exciting area of Natural Language Engineering that has the potential to replace one major use of search engines today. In this dissertation, I introduce a new method of handling factoid questions whose answers are proper names. The method, Topic Indexing and Retrieval, addresses two issues that prevent current factoid QA systems from realising this potential: they can't satisfy users' demand for almost immediate answers, and they can't produce answers based on evidence distributed across a corpus. The first issue arises because the architecture common to QA systems does not scale easily to heavy use, since so much of the work is done on-line: text retrieved by information retrieval (IR) undergoes expensive and time-consuming answer extraction while the user awaits an answer. If QA systems are to become as heavily used as popular web search engines, this massive processing bottleneck must be overcome. The second issue, how to make use of the distributed evidence in a corpus, is relevant when no single passage in the corpus provides sufficient evidence for an answer to a given question. QA systems commonly look for a text span that contains sufficient evidence to both locate and justify an answer, but this fails for questions that require evidence from more than one passage in the corpus. The Topic Indexing and Retrieval method developed in this thesis addresses both these issues for factoid questions with proper name answers by restructuring the corpus in such a way that it enables direct retrieval of answers using off-the-shelf IR. The method has been evaluated on 377 TREC questions with proper name answers and 41 questions that require multiple pieces of evidence from different parts of the TREC AQUAINT corpus. With regard to the first evaluation, scores of 0.340 in Accuracy and 0.395 in Mean Reciprocal Rank (MRR) show that Topic Indexing and Retrieval performs well for this type of question.
A second evaluation compares performance on a corpus of 41 multi-evidence questions by a question-factoring baseline method that can be used with the standard QA architecture and by my Topic Indexing and Retrieval method. The superior performance of the latter (MRR of 0.454 against 0.341) demonstrates its value in answering such questions.
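The core idea, moving the expensive work off-line by indexing per-topic documents so that answering a question reduces to ordinary retrieval over topics, might be sketched as below. The index structure and simple term-overlap scoring are simplifying assumptions for illustration, not the thesis's actual implementation.

```python
# Sketch of off-line topic indexing: aggregate text about each proper-name
# topic is indexed ahead of time, so a question is answered by retrieving
# topics directly rather than extracting answers from passages on-line.
from collections import defaultdict

def build_topic_index(topic_documents):
    """Map each term to the set of topics whose aggregated text contains it."""
    index = defaultdict(set)
    for topic, text in topic_documents.items():
        for term in text.lower().split():
            index[term].add(topic)
    return index

def answer(question, index):
    """Rank topics by how many question terms their topic documents match."""
    scores = defaultdict(int)
    for term in question.lower().split():
        for topic in index.get(term, ()):
            scores[topic] += 1
    return sorted(scores, key=scores.get, reverse=True)

# Toy topic documents; evidence for a topic may come from many passages.
topic_documents = {
    "Neil Armstrong": "astronaut first person to walk on the moon in 1969",
    "Buzz Aldrin": "astronaut second person on the moon apollo 11",
}
index = build_topic_index(topic_documents)
print(answer("who was the first person on the moon", index))
# → ['Neil Armstrong', 'Buzz Aldrin']
```

Because a topic document aggregates text from across the corpus, evidence scattered over several passages still accumulates on the same topic, which is what the standard single-passage architecture cannot do.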
4

Translation memory-systemer som værktøjer til juridisk oversættelse : kritisk vurdering af anvendeligheden af translation memory-systemer til oversættelse af selskabsretlig dokumentation / Translation memory systems as tools for legal translation: a critical assessment of the applicability of translation memory systems to the translation of company-law documentation.

Christensen, Tina Paulsen. January 2003 (has links) (PDF)
Ph.D. dissertation.
5

Language Engineering for Information Extraction

Schierle, Martin 10 January 2012 (has links) (PDF)
Accompanied by the cultural development to an information society and knowledge economy and driven by the rapid growth of the World Wide Web and decreasing prices for technology and disk space, the world's knowledge is evolving fast, and humans are challenged with keeping up. Despite all efforts on data structuring, a large part of this human knowledge is still hidden behind the ambiguities and fuzziness of natural language. Domain language in particular poses new challenges with its specific syntax, terminology and morphology. Companies willing to exploit the information contained in such corpora are often required to build specialized systems instead of being able to rely on off-the-shelf software libraries and data resources. The engineering of language processing systems is, however, cumbersome, and the creation of language resources, annotation of training data and composition of modules is often more an art than a science. The scientific field of Language Engineering aims at providing reliable information, approaches and guidelines on how to design, implement, test and evaluate language processing systems. Language engineering architectures have been a subject of scientific work for the last two decades and aim at building universal systems of easily reusable components. Although current systems offer comprehensive features and rely on an architecturally sound basis, there is still little documentation about how to actually build an information extraction application. Selection of modules, methods and resources for a distinct use case requires a detailed understanding of state-of-the-art technology, application demands and characteristics of the input text. The main assumption underlying this work is the thesis that a new application can only occasionally be created by reusing standard components from different repositories.
This work recapitulates existing literature about language resources, processing resources and language engineering architectures to derive a theory of how to engineer a new system for information extraction from a (domain) corpus. This thesis was initiated by Daimler AG to prepare and analyze unstructured information as a basis for corporate quality analysis. It is therefore concerned with language engineering in the area of Information Extraction, which targets the detection and extraction of specific facts from textual data. While other work in the field of information extraction is mainly concerned with the extraction of location or person names, this work deals with automotive components, failure symptoms, corrective measures and their relations in arbitrary arity. The ideas presented in this work are applied, evaluated and demonstrated on a real-world application dealing with quality analysis on automotive domain language. To achieve this goal, the underlying corpus is examined and scientifically characterized, and algorithms are picked with respect to the derived requirements and evaluated where necessary. The system comprises language identification, tokenization, spelling correction, part-of-speech tagging, syntax parsing and a final relation extraction step. The extracted information is used as input to data mining methods such as an early warning system and a graph-based visualization for interactive root cause analysis. It is finally investigated how the unstructured data facilitates those quality analysis methods in comparison to structured data. The acceptance of these text-based methods in the company's processes further proves the usefulness of the created information extraction system.
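A staged pipeline of the kind described, where each processing step consumes and enriches a shared document, can be composed as a simple chain of stages. The stage internals below are toy placeholders invented for illustration, not the actual Daimler system; in particular the spelling lexicon and the component/symptom vocabularies are assumptions.

```python
# Sketch of a staged information-extraction pipeline: each stage takes the
# document dict, adds its annotations, and passes it on to the next stage.
def tokenize(doc):
    doc["tokens"] = doc["text"].split()  # naive whitespace tokenizer
    return doc

def correct_spelling(doc, lexicon={"moter": "motor"}):
    # Toy correction: replace known misspellings of domain terms.
    doc["tokens"] = [lexicon.get(t, t) for t in doc["tokens"]]
    return doc

def extract_relations(doc, components={"motor"}, symptoms={"overheats"}):
    # Toy relation extraction: pair each component with each symptom found.
    doc["relations"] = [(c, s) for c in doc["tokens"] if c in components
                        for s in doc["tokens"] if s in symptoms]
    return doc

def run_pipeline(text, stages):
    doc = {"text": text}
    for stage in stages:
        doc = stage(doc)
    return doc

result = run_pipeline("moter overheats",
                      [tokenize, correct_spelling, extract_relations])
print(result["relations"])  # → [('motor', 'overheats')]
```

The example also shows why stage ordering matters: without the spelling-correction stage before relation extraction, the misspelled component would never match the domain vocabulary.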
6

The Development of materials for teaching English to Hong Kong Polytechnic engineering students /

Ng, Kam-ling, Evelyn. January 1979 (has links)
Thesis (M. Ed.)--University of Hong Kong, 1980.
7

The design, construction, and implementation of an engineering software command processor and macro compiler /

Coleman, Jesse J. January 1995 (has links)
Thesis (M.S.)--Rochester Institute of Technology, 1995. / Typescript. Includes bibliographical references (leaves 186-187).
8

The development of materials for teaching English to Hong Kong Polytechnic engineering students

Ng, Kam-ling, Evelyn. January 1979 (has links)
Thesis (M.Ed.)--University of Hong Kong, 1980. / Also available in print.
9

Natural Language Processing from a Software Engineering Perspective

Åkerud, Daniel, Rendlo, Henrik January 2004 (has links)
This thesis deals with questions related to the processing of naturally occurring texts, also known as natural language processing (NLP). The subject is approached from a software engineering perspective, and the problem description is formulated accordingly. The thesis is roughly divided into two major parts. The first part contains a literature study covering fundamental concepts and algorithms. We discuss both serial and parallel architectures, and conclude that different scenarios call for different architectures. The second part is an empirical evaluation of an NLP framework or toolkit, chosen from among several candidates, conducted in order to elucidate the theoretical part of the thesis. We argue that component-based development in a portable language could increase reusability in the NLP community, where reuse is currently low. The recent emergence of the initiatives discussed here and the great potential of many applications in this area suggest a bright future for NLP.
10

Leveraging software product lines engineering in the construction of domain specific languages / Usage de l'ingénierie de lignes de produits pour la construction de langages dédiés

Méndez Acuña, David Fernando 16 December 2016 (has links)
The growing complexity of modern software systems has motivated the need to raise the level of abstraction at which they are designed and implemented. The use of domain-specific languages emerged in response to this need. A domain-specific language allows a software system to be specified through concepts belonging to its application domain. This approach has several advantages, such as reducing the technical details developers must deal with, separating concerns, and involving domain experts in the development process. Despite these advantages, the approach has drawbacks that call its relevance into question in real software development projects. One of them is the cost of building domain-specific languages. Defining and tooling these languages is a complex, time-consuming task that requires specialized technical skills. The development process becomes even more complex when we take into account that such languages may have several dialects. In this context, a dialect is a variant of a language that introduces differences in syntax and/or semantics. To reduce the cost of developing domain-specific languages, language designers must reuse as many definitions as possible when building variants. The goal is to exploit previously defined definitions and tooling so as to minimize implementation from scratch. To address this research question, the language engineering community has proposed the use of product lines, and the notion of language product lines has recently emerged as a result. A language product line is a product line whose products are languages.
The main goal of language product lines is the independent definition of language modules. These modules can be combined in different ways to configure languages adapted to specific situations. Similarly to product lines, language product lines can be built through two different approaches: top-down and bottom-up. In the top-down approach, language product lines are designed and implemented through a domain analysis process in which domain knowledge is used to define a set of language modules that realize the features of the language product line. Domain knowledge is also used to represent the variability of the line through well-structured models that additionally serve to configure particular languages. In the bottom-up approach, language product lines are built from a set of existing language variants through reverse-engineering techniques. Building on these approaches, we propose two contributions: (1) facilities to support the top-down approach: a language modularization approach that allows domain-specific languages to be decomposed into interdependent language modules, together with a modelling strategy to represent the variability of a language product line; (2) reverse-engineering techniques to support the bottom-up approach: as a second contribution, we propose a reverse-engineering technique to automatically build a language product line from a set of existing language variants. Our contributions are validated through industrial case studies.
/ The use of domain-specific languages (DSLs) has become a successful technique in the development of complex systems because it furnishes benefits such as abstraction, separation of concerns, and improved productivity. Nowadays, we can find a large variety of DSLs providing support in various domains. However, the construction of these languages is an expensive task. Language designers must invest a significant amount of time and effort in the definition of formal specifications and tooling for the DSLs that tackle the requirements of their companies. The construction of DSLs becomes even more challenging in multi-domain companies that provide several products. In this context, DSLs must often be adapted to diverse application scenarios, so language development projects address the construction of several variants of the same DSL. At this point, language designers face the challenge of building all the required variants by reusing, as much as possible, the commonalities existing among them. The objective is to leverage previous engineering efforts to minimize implementation from scratch. As an alternative for dealing with this challenge, recent research in software language engineering has proposed the use of product line engineering techniques to facilitate the construction of DSL variants. This led to the notion of language product lines, i.e., software product lines where the products are languages. Similarly to software product lines, language product lines can be built through two different approaches: top-down and bottom-up. In the top-down approach, a language product line is designed and implemented through a domain analysis process. In the bottom-up approach, the language product line is built up from a set of existing DSL variants through reverse-engineering techniques. In this thesis, we provide support for the construction of language product lines according to the two approaches mentioned before.
On one hand, we propose facilities in terms of language modularization and variability management to support the top-down approach. Those facilities are accompanied by methodological insights intended to guide the domain analysis process. On the other hand, we introduce a reverse-engineering technique to support the bottom-up approach. This technique includes a mechanism to automatically recover a modular language design for the language product line, as well as a strategy to synthesize a variability model that can later be used to configure concrete DSL variants. The ideas presented in this thesis are implemented in a well-engineered language workbench. This implementation facilitates the validation of our contributions in three case studies. The first case study is dedicated to validating our language modularization approach which, as we explain later in this document, is the backbone of any approach supporting language product lines. The second and third case studies validate our contributions to top-down and bottom-up language product lines respectively.
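The configuration of a concrete DSL variant from independently defined language modules, as described above, can be illustrated with a small dependency-closure sketch. The module names and dependency constraints below are invented for the example and do not come from the thesis; a real language workbench would configure variants from a synthesized variability model rather than a hand-written table.

```python
# Sketch of configuring a DSL variant from a language product line: each
# module declares the modules it depends on, and a selection is closed
# under those dependencies to yield a well-formed variant.
LANGUAGE_MODULES = {
    "expressions": set(),            # no dependencies
    "statements": {"expressions"},   # statements build on expressions
    "state_machines": {"statements"},
    "concurrency": {"state_machines"},
}

def configure_variant(selected):
    """Close a module selection under dependencies; reject unknown modules."""
    unknown = selected - LANGUAGE_MODULES.keys()
    if unknown:
        raise ValueError(f"unknown modules: {unknown}")
    variant = set(selected)
    changed = True
    while changed:
        changed = False
        for module in list(variant):
            missing = LANGUAGE_MODULES[module] - variant
            if missing:
                variant |= missing  # pull in required modules
                changed = True
    return variant

print(sorted(configure_variant({"state_machines"})))
# → ['expressions', 'state_machines', 'statements']
```

Selecting only `state_machines` transitively pulls in `statements` and `expressions`, which mirrors how a variability model prevents ill-formed variants that omit required language pieces.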
