Global ETD Search

1	From Information Extraction to Knowledge Discovery: Semantic Enrichment of Multilingual Content with Linked Open Data De Wilde, Max 23 October 2015 Discovering relevant knowledge in unstructured text is not a trivial task. Search engines relying on full-text indexing of content reach their limits when confronted with poor quality, ambiguity, or multiple languages. Some of these shortcomings can be addressed by information extraction and related natural language processing techniques, but these still fall short of adequate knowledge representation. In this thesis, we defend a generic approach striving to be as language-independent, domain-independent, and content-independent as possible. To reach this goal, we propose to disambiguate terms with their corresponding identifiers in Linked Data knowledge bases, paving the way for full-scale semantic enrichment of textual content. The added value of our approach is illustrated with a comprehensive case study based on a trilingual historical archive, addressing constraints of data quality, multilingualism, and language evolution. A proof-of-concept implementation is also proposed in the form of a Multilingual Entity/Resource Combiner & Knowledge eXtractor (MERCKX), demonstrating to a certain extent the general applicability of our methodology to any language, domain, and type of content. / Discovering new knowledge in unstructured text is not an easy task. Search engines based on full-text indexing of content show their limits when confronted with texts that are of poor quality, ambiguous, and/or multilingual. Information extraction and other techniques from natural language processing can partially address this problem, but without achieving the ideal of an adequate representation of knowledge.
In this thesis, we defend a generic approach that aims to be as independent as possible of the languages, domains, and types of content processed. To do so, we propose to disambiguate terms using identifiers from knowledge bases of the Web of Data, thereby facilitating the semantic enrichment of content. The added value of this approach is illustrated by a case study based on a trilingual historical archive, with particular emphasis on the constraints of quality, multilingualism, and evolution over time. A prototype tool is also developed under the name Multilingual Entity/Resource Combiner & Knowledge eXtractor (MERCKX), thereby demonstrating that our approach can be generalized, to a certain extent, to any language, domain, or type of content. / Doctorate in Information and Communication / info:eu-repo/semantics/nonPublished Natural language semantics Language processing natural language processing information extraction linked data semantic web digital humanities
2	Making Sense of Mention, Quotation, and Autonymy: A Semantic and Pragmatic Survey of Metalinguistic Discourse De Brabanter, Philippe 19 November 2002 The goal I have pursued in writing this dissertation has been to provide the most complete account that I could manage of the various aspects of language that can be labelled metalinguistic, both in the language-system and in discourse. On a rough characterisation, metalanguage is language about language. Since I understand language both as a ‘potential’ (the language-system) and as its actualisation (language as discourse), there are theoretically four situations that can be subsumed under the term ‘metalanguage’: 1. there are lexical items (units in the system) that denote aspects of the system (preposition, noun, conjugation, plural, etc.); 2. there are items that denote elements of discourse (words and phrases like the aforementioned, the latter, etc.). At the same time, there are 3. utterances about the system (e.g. ‘Boston’ is a noun), and 4. utterances about discourse (i.e. about other utterances or parts of utterances, e.g. The old cow said teddible instead of terrible). In both 3 and 4, we have words that reflexively mention linguistic sequences. Following Rey-Debove, I have chosen to call these ‘autonyms’. Note also that discourse about language can be combined with discourse about extralinguistic reality. An utterance about a situation in the world can secondarily say something, for example, about language use; such is the case in The U.S. advocates ‘military action’, as newspapermen call it now, where a comment about a euphemism is appended to a statement about ‘the world’. All in all, this amounts to a fairly large body of data that is varied in kind. My goal has been to bring some order to this variegated set, to highlight in what respects its elements are similar and dissimilar.
Thus, I have sought to sort out a number of issues that had not, as far as I could judge, been treated satisfactorily on previous occasions, and to make my descriptions compatible with the theory that was gradually taking shape. In particular, I have underlined the strong connections between the system-level aspects of metalanguage and its discourse manifestations, and I have been led to suggest that the latter ‘leak into’ the system. Besides, I have tried to give a more thorough account of certain properties of metalinguistic discourse, notably the recursiveness of mention or quotation, and its referential diversity. Once I felt that I had come to an adequate account of metalinguistic discourse, I attempted to supply a typology of its various manifestations that would integrate most of the criteria brought up in previous attempts. In the final part of the dissertation, I have brought together what I regard as a series of genuine challenges to the best existing theories of metalinguistic discourse, and have attempted to frame possible solutions.
The very notion of metalanguage originated in formal logic in the first half of the 20th c. Soon, some of the concepts developed by logicians were taken over by philosophers of language (and subsequently by a few linguists). That was notably the case with the distinction between the use and the mention of a linguistic sequence, use designating the ordinary, transparent employment of an expression to denote something outside language, and mention its being chosen as a topic for discussion. When the subject came under the scrutiny of philosophers of language, the essentially prescriptive approach of the logician (the logician decreed which features his languages and metalanguages should possess) was turned into an attempt at describing actual linguistic mechanisms.
It is in this tradition that I situate myself. Philosophers of language have turned out to be particularly interested in quotation (the mention of linguistic expressions), but I have thought it useful to introduce a term that covered not just quotation, but also mention-without-quote-marks, as well as hybrid cases like example 5. This term is reflexive metalinguistic demonstration, but for convenience’ sake I shall make do with metalinguistic demonstration. In Chapter 2, I have examined in detail the main theories of metalinguistic demonstration put forward in the course of the 20th c., namely the Name, Description, Demonstrative and Identity theories. In the process, I have been able to gradually identify the various properties of metalinguistic demonstrations that should be regarded as essential. And I have also formed a clearer idea of the body of data that a theory should be able to account for. In the end, I have been able to outline what I believe is a sound theory of metalinguistic demonstrations. This theory is chiefly informed by the proposals of François Recanati (2000, 2001), supplemented with insights of Paul Saka (1998), both of whom are indebted to the Demonstrative and Identity accounts. My reasons for using Recanati (2001) as the backbone of my own theory are the following. Recanati has successfully drawn the line between two types of meaning conveyed by metalinguistic demonstrations, namely ‘pictorial’ and ‘conventional linguistic’ meaning, something that had not been done with that clarity before. Besides, he has had the wisdom to give up the standard assumption that all metalinguistic demonstrations are referential, an assumption that inevitably led to theoretical dead ends.
Moreover, drawing on the first two insights, Recanati has also separated out the syntactic and pragmatic aspects that were often confused in previous approaches. There is no doubt that the theory put forward by Recanati in 2001 is the most empirically adequate that can be found in the literature. Besides, it also accounts for an impressive range of key properties. Still, there are two interesting properties that received very little attention from Recanati, namely referential diversity and recursiveness. Though Paul Saka argued in favour of both in a 1998 paper, I believe his defence to have been somewhat clumsy, and I have therefore tried to offer more convincing evidence in favour of these properties.
Let’s start with reference. As Recanati has shown, not all metalinguistic demonstrations are referential expressions. But there is one aspect of reference that he says very little about: the sort or sorts of referents that a referential autonym can have. The theory implicitly suggests that autonyms can only refer to types. (Many writers have claimed more robustly and more explicitly that there is only one sort of referent for autonyms, always either types or classes of tokens.) I hold this view to be incorrect. As I’ve indicated in Chapter 4 of the thesis, I believe that several sorts of referents must be distinguished. Let us have a few examples:
Run is a verb.
Run has three letters.
She said, “I ain’t EVER gonna tell ya”.
The first refers to a lexeme, since the predicate applies to runs, ran, running as well. The second refers only to a form (since it is not true of running or runs). Both could still be said to be abstract objects, and one might wish to call these ‘types’. The third, however, well and truly seems to refer to a token, the particular utterance produced by the woman behind she, witness the mimicry involved in the direct speech report. In my discussion of the next property, I offer a further argument in favour of referential diversity.
2. Metalinguistic demonstrations can be iterated (repeated), a property usually described as recursiveness, and which has given rise to some controversies. Some demonstrativists, notably Cappelen & Lepore, because they hold the interior of a quotation to be semantically inert, have rejected the idea of recursiveness. I think, however, that their rejection comes from their failure to discern several types of recursiveness. In my dissertation, I have distinguished three; I shall only sketch two here.
“ ‘Boston’ ” is an autonym.
Typographical recursiveness: hardly very interesting, since it is a mechanical operation that can be repeated at will. The next pair of examples throws a more interesting light on the matter:
‘Boston’ is a six-letter word.
In each utterance of the previous example, “ ‘Boston’ ” is used to refer to an orthographic form.
Boston enclosed in two pairs of quote marks refers to particular tokens of Boston in a single pair of quote marks, as are produced when uttering a token of the first sentence, ‘Boston’ is a six-letter word. In each utterance of that sentence, the subject, ‘Boston’, itself refers, this time to the name Boston. This means that we have a situation in which an autonym refers to another autonym which also refers: reference here is iterated. This is actually no problem for the assumption of the inertness of the interior of the quotation, because reference is directed outwards: the interior of the quotation itself (the token displayed) remains inert. Note that referential recursiveness is only possible when one has a meta-quotation that refers to a token that is itself a referential autonym. This confirms the need for the theory to accommodate reference to particular tokens. I have made further use of the theory of metalinguistic demonstrations in Chapter 6 of the thesis, which is devoted to sketching a typology of metalinguistic demonstrations.
In this connection, I have tried to bring together different types of discriminating factors that had been used in previous classifications (syntactic, semantic, pragmatic, typographical, lexical). These did not seem to be compatible from the outset, but then I realised that they might perhaps all be integrated into a single typology if I adopted an interpreter’s perspective. I reflected that this perspective provided a criterion for determining which characteristics of metalinguistic demonstrations would count as relevant variables for a typology: only those that made a ‘difference for the interpreter’ (i.e. affected his/her interpretative processes) would be retained. I also took advantage of the general theory for the interpretation of utterances that has been set out in some recent publications, notably by Bach and Recanati (and which I outline in Chapter 3 of the thesis), and eventually reached what I regard as a decent result. Moreover, I also made a couple of interesting discoveries. The first one is that quite a bit of the interpretation of an utterance takes place at a ‘pre-interpretative’ level, that is, before a sentence has been clearly identified (disambiguated). In particular, there are significant pictorial aspects of metalinguistic demonstrations that enter into the disambiguation process rather than into interpretation proper. The second one is that an impressive number of aspects of meaning that are linked to the speaker’s intentions, and should theoretically require access to the wide context of an utterance to be processed, can be accessed at very low (semantic) levels of interpretation. In the final part of this presentation, I wish to examine a couple of instances of hybridity that confront the theory with a more serious challenge than example 5 on the first slide.
That example was easily explained in terms of simultaneous use and mention (the standard account in the literature): the same sequence, military action, was used ordinarily and, secondarily, demonstrated as being a particular form of euphemism. Other hybrids, on the other hand, do not lend themselves to such an analysis in a straightforward way. The first example I wish to bring up raises an interesting problem in connection with the notion of grammaticality:
Robbe-Grillet describes himself in his introduction as “volontiers professeur de moi-même”.
This can be rewritten as a pair of sentences, one for use, the other for mention. We get:
Use: Robbe-Grillet describes himself in his introduction as volontiers professeur de moi-même.
Mention: Robbe-Grillet uses the expression “volontiers professeur de moi-même”.
Although the mention line raises no special issues, there are great doubts as to the grammaticality of the ordinary-use line: a language-shift occurs in the middle of the sentence, and is not signalled by any marker, unlike in the initial hybrid. Though Recanati’s framework allows for language-shifts, and could therefore be relied on to argue that the correct interpretation can be ascribed to the French words in the example, it does not state rules determining at which spot in an utterance such a shift is acceptable grammatically. In other words, it says nothing about the possibility of a grammar that would straddle English and French. Fortunately, the idea of such grammars is supported by the limited research that has been carried out about code-switching. So, there may be theoretical backing for the assumption that the use line may after all be grammatical (with respect to a hybrid grammar). Note that these remarks are valid, I believe, not just for the use line of the twofold paraphrase, but for the initial hybrid too.
Indeed, it is not clear (though some would be ready to say so) that the presence of quote marks is enough to alter the grammaticality of an utterance. Note also that an example like the previous one is a reminder of an essential fact about the work of language scholars: they start out to describe and/or explain some empirical data they find significant. But as things get more complicated, they must continually make decisions as to what must be acknowledged as relevant data for their research. Every step of the way, there may be a temptation to dismiss data (in the present case, on grounds of ungrammaticality) because these data threaten the validity of the theory being devised. Here, thanks to an analogy with grammatical accounts of code-switching, a case can be made for the grammaticality of utterances like the one under consideration. It is these kinds of extensions that broaden the linguist’s horizons and make research worthwhile. The second example I wish to examine raises interesting issues concerning iconicity. Though I have said nothing about it so far, iconicity is perhaps the single most important notion in any discussion of metalinguistic demonstration. In a nutshell, the basic assumption about ‘how such a demonstration makes sense’ is that the tokens displayed in a mentioning utterance are iconically related to the target of the demonstration. Iconicity can initially be understood as a matter of formal resemblance (cf. the first batch of examples on Slide 1). The following example shows that the notion must be made more flexible than that:
Descartes said that man “is a thinking substance”.
Use: Descartes said that man is a thinking substance.
Mention: Descartes said “is a thinking substance”.
It can be seen that the mention line of the paraphrase is truth-conditionally incorrect: Descartes did not produce a token of is a thinking substance, since he was writing in Latin, not English. What Descartes said was est res cogitans.
This might be taken to imply that the relation between the English tokens displayed and the Latin target is not a matter of iconicity. I would, with several other writers, suggest another direction: there is iconicity in this example, but the concept must be understood to be flexible and adaptable to contextual constraints. I believe such a conception to be necessary if one wants to be able to account for metalinguistic demonstrations within a single explanatory framework. There are too many instances of quotations that are not supported by formal identity to maintain a rigid notion of iconicity. I have added a last example on the slide.
Conclusion
Although I originally aspired to a comprehensive survey of things metalinguistic, I cannot but concede that there are still multiple aspects of the reflexive use of language that need looking into. I believe, however, that I have been able to shed some light on some areas of the debate. For instance, I believe that my discussion of the recursiveness and referential diversity of autonyms goes one step further than previous discussions. In particular, I hope to have been able to show convincingly that, contrary to a widespread opinion, an autonym can refer to various objects, notably individual tokens. When these results are added to an excellent theory like Recanati’s, one ends up with a powerful explanatory apparatus. Moreover, this apparatus has the added advantage that it can easily be integrated into the general theory for the interpretation of utterances which I have alluded to before. I have taken advantage of this compatibility to outline my interpreter’s typology of metalinguistic demonstrations. Whether that effort was entirely successful or not, I think it has incidentally provided an excellent testing ground for the general theory.
If only in that respect, the attempt was worth a try, since it shed light on the importance of pre-interpretative processes and on the conventional encoding of aspects of meaning that are otherwise heavily dependent on speaker’s intentions. Finally, I believe that the work done in Chapter 8 has brought to the fore a number of questions that deserve to be investigated at greater length in future. There are still dark areas in the study of world/language hybrids, but there are also more general questions, e.g. regarding grammaticality and iconicity, that need looking into. / Doctorate in Philosophy and Letters, specialisation in Language and Literature / info:eu-repo/semantics/nonPublished General linguistics Analytic philosophy Natural language semantics Pragmatics use and mention metalinguistic autonym quotation hybrid quotation depiction iconic
3	The Mechanics of Indirectness: A Case Study of Directive Speech Acts Ruytenbeek, Nicolas 02 March 2017 This dissertation investigates the comprehension of indirect requests (IRs). Focusing on English and French, it proposes that IRs such as Can you + verbal phrase (for short, Can you VP?) achieve an optimal communicative efficiency because, while they entail extra processing costs, they match the expected level of politeness in many contexts. The approach taken combines Talmy’s force dynamic semantics with a traditional perspective in philosophy of language drawing on speech act theory. First, I sketch a theoretically viable and empirically plausible definition of directive speech acts, and provide a naturalistic explanation of why directives result in obligations for addressees. According to this definition, a directive speech act consists in a force exerted by a speaker towards an addressee’s performance of some action, with a prima facie obligation created for the addressee as a result. Consistently with this definition, I propose that imperative sentences are a convenient means to perform directives insofar as they encode a force dynamic pattern that is compatible with, but distinct from, the force exertion pattern that characterizes directives. I develop a similar analysis for You should/must VP declarative sentences. By contrast, I argue that, if interrogative sentences can be used in the performance of directives such as questioning, they do so by virtue of their incompleteness. To satisfactorily account for the variety of utterances that can be used as directives, I propose a typology based on the formal criterion of (in)directness and on the processing criterion of primariness/secondariness. Three factors are furthermore predicted to influence the processing of IRs: conventionality of means, degree of standardization, and degree of illocutionary force salience.
This typology underpins an exhaustive review of experimental work on the comprehension of directives, in which I conclude that further investigation into the processing of IRs is necessary. In particular, the influence of these three factors on the processing, and especially on the primariness/secondariness, of IRs is left unexplored. In three eye-tracking experiments with native speakers of French, I put to the test four hypotheses. First, I hypothesize that the more an expression is standardized for the performance of IRs, the more likely it will be understood as an IR, and the more likely the IR will be primary rather than secondary. Second, because expressions such as Can you VP? used as IRs also have a direct interpretation, they should entail extra processing costs relative to their imperative and interrogative direct counterparts. Third, assuming they are direct, You must VP requests should be understood like imperative requests, and they should not activate the assertive force. Fourth, the high degree of directive illocutionary force salience contributed by the adverb please should increase the likelihood of an IR interpretation and the likelihood that the IR will be primary. In Experiment 1, I show that IR interpretations tend to be more frequent for highly standardized IRs relative to their less standardized counterparts. I also demonstrate that interpreting the highly standardized Can you VP? and the less standardized Is it possible to VP? as IRs does not activate their “ability question” illocutionary meaning. The same finding holds, in Experiment 2, for the declaratives You can VP and It is possible to VP. The data of Experiment 2 indicate that, like imperative sentences, You must VP does not activate the assertive illocutionary force. Another finding of Experiment 1 is that Can you VP? and Is it possible to VP?
can be understood as primary IRs, but these expressions nonetheless impose extra processing costs when they are interpreted as direct questions. In Experiment 3, I find that the high degree of directive force salience contributed by please increases the likelihood of an IR interpretation regardless of the degree of standardization of the expression. However, the presence of please has no significant influence on the processing of IRs. Turning to the production of directives, I address the issue of why speakers use IRs despite the extra processing costs entailed by these expressions. In a production task experiment where addressee status is manipulated, I test the hypothesis that Can you VP? IRs are used to trigger extra politeness effects absent in imperatives. A second hypothesis is that speakers should avoid imperatives and obligation declaratives such as You should/must VP because these request forms are directly compatible with force exertion at the pragmatic level. Rather, they should prefer indirect request forms such as ability interrogatives. Third, Can you VP? should be more frequent than Is it possible to VP? in the data. A first important finding is that higher addressee status does not increase the frequency of Can you VP? interrogatives relative to other request forms. Instead of using Can you VP? more often when they address higher status people, speakers use specific politeness markers, which disconfirms the hypothesis that Can you VP? is used to convey extra politeness effects. The second hypothesis is confirmed, insofar as the data collected with this production task contain a vast majority of ability interrogatives, and imperatives and obligation declaratives are absent. Third, in line with the standardization hypothesis, Can you VP? occurs much more often in the data than Is it possible to VP?.
/ Doctorate in Languages, Letters and Translation Studies / info:eu-repo/semantics/nonPublished Natural language semantics Sociolinguistics Psycholinguistics General linguistics French language and literature
