Spelling suggestions: "subject:"contextfree grammar"" "subject:"context.three grammar""
1 |
A Mixed-Response Intelligent Tutoring System Based on Learning from DemonstrationAlvarez Xochihua, Omar 2012 May 1900 (has links)
Intelligent Tutoring Systems (ITS) have a significant educational impact on student's learning. However, researchers report time intensive interaction is needed between ITS developers and domain-experts to gather and represent domain knowledge. The challenge is augmented when the target domain is ill-defined. The primary problem resides in often using traditional approaches for gathering domain and tutoring experts' knowledge at design time and conventional methods for knowledge representation built for well-defined domains. Similar to evolving knowledge acquisition approaches used in other fields, we replace this restricted view of ITS knowledge learning merely at design time with an incremental approach that continues training the ITS during run time. We investigate a gradual knowledge learning approach through continuous instructor-student demonstrations. We present a Mixed-response Intelligent Tutoring System based on Learning from Demonstration that gathers and represents knowledge at run time. Furthermore, we implement two knowledge representation methods (Weighted Markov Models and Weighted Context Free Grammars) and corresponding algorithms for building domain and tutoring knowledge-bases at run time.
We use students' solutions to cybersecurity exercises as the primary data source for our initial framework testing. Five experiments were conducted using various granularity levels for data representation, multiple datasets differing in content and size, and multiple experts to evaluate framework performance. Using our WCFG-based knowledge representation method in conjunction with a finer data representation granularity level, the implemented framework reached 97% effectiveness in providing correct feedback. The ITS demonstrated consistency when applied to multiple datasets and experts. Furthermore, on average, only 1.4 hours were needed by instructors to build the knowledge-base and required tutorial actions per exercise. Finally, the ITS framework showed suitable and consistent performance when applied to a second domain.
These results imply that ITS domain models for ill-defined domains can be gradually constructed, yet generate successful results with minimal effort from instructors and framework developers. We demonstrate that, in addition to providing an effective tutoring performance, an ITS framework can offer: scalability in data magnitude, efficiency in reducing human effort required for building a confident knowledge-base, metacognition in inferring its current knowledge, robustness in handling different pedagogical and tutoring criteria, and portability for multiple domain use.
|
2 |
Musical Phrase Segmentation via Grammatical InductionPerkins, Reed James 06 April 2022 (has links)
Procedural generation algorithms can infer rules based on a dataset of examples when each example is made up of labeled components. Unfortunately, musical sequences resist potential inclusion in these kinds of datasets because they lack explicit structural semantics. In order to algorithmically transform a musical sequence into a sequence of labeled components, a segmentation process is needed. We outline a solution to the challenge of musical phrase segmentation that uses grammatical induction algorithms, a class of algorithms which infer a context-free grammar from an input sequence. We study five different grammatical induction algorithms on three different datasets, one of which is introduced in this work. Additionally, we test how the performance of each algorithm varies when transforming musical sequences using viewpoint combinations. Our experiments show that the algorithm longestFirst achieves the best F1 scores across all three datasets, and that viewpoint combinations which include the duration viewpoint result in the best performance.
|
3 |
Intensional Context-Free GrammarLittle, Richard 02 January 2014 (has links)
The purpose of this dissertation is to develop a new generative grammar, based on the principles of intensional logic. More specifically, the goal is to create a psychologically real grammar model for use in natural language processing. The new grammar consists of a set of context-free rewrite rules tagged with intensional versions.
Most generative grammars, such as transformational grammar, lexical functional-grammar and head-driven phrase structure grammar, extend traditional context-free grammars with a mechanism for dealing with contextual information, such as subcategorization of words and agreement between different phrasal elements. In these grammars there is not enough separation between the utterances of a language and the context in which they are uttered. Their models of language seem to assume that context is in some way encapsulated in the words of the language instead of the other way around.
In intensional logic, the truth of a statement is considered in the context in which it is uttered, unlike traditional predicate logic in which truth is assigned in a vacuum, regardless of when or where it may have been stated. To date, the application of the principles of intensionality to natural languages has been confined to semantic theory. We remedy this by applying the ideas of intensional logic to syntactic context, resulting in intensional context-free grammar.
Our grammar takes full advantage of the simplicity and elegance of context-free grammars while accounting for information that is beyond the sentence itself, in a realistic way. Sentence derivation is entirely encapsulated in the context of its utterance. In fact, for any particular context, the entire language of the grammar is encapsulated in that context. This is evidenced by our proof that the language of an intensional grammar is a set of context-free languages, indexed by context.
To further support our claims we design and implement a small fragment of English using the grammar. The English grammar is capable of generating both passive and active sentences that include a subject, verb and up to two optional objects. Furthermore, we have implemented a partial French to English translation system that uses a single language dimension to initiate a translation. This allows us to include multiple languages in one grammar, unlike other systems which must separate the grammars of each language. This result has led this author to believe that we have created a grammar that is a viable candidate for a true Universal Grammar, far exceeding our initial goals. / Graduate / 0984 / 0800 / 0290 / rlittle@uvic.ca
|
4 |
Ambiguity Detection for Programming Language GrammarsBasten, Bas 15 December 2011 (has links) (PDF)
Context-free grammars are the most suitable and most widely used method for describing the syntax of programming languages. They can be used to generate parsers, which transform a piece of source code into a tree-shaped representation of the code's syntactic structure. These parse trees can then be used for further processing or analysis of the source text. In this sense, grammars form the basis of many engineering and reverse engineering applications, like compilers, interpreters and tools for software analysis and transformation. Unfortunately, context-free grammars have the undesirable property that they can be ambiguous, which can seriously hamper their applicability. A grammar is ambiguous if at least one sentence in its language has more than one valid parse tree. Since the parse tree of a sentence is often used to infer its semantics, an ambiguous sentence can have multiple meanings. For programming languages this is almost always unintended. Ambiguity can therefore be seen as a grammar bug. A specific category of context-free grammars that is particularly sensitive to ambiguity are character-level grammars, which are used to generate scannerless parsers. Unlike traditional token-based grammars, character-level grammars include the full lexical definition of their language. This has the advantage that a language can be specified in a single formalism, and that no separate lexer or scanner phase is necessary in the parser. However, the absence of a scanner does require some additional lexical disambiguation. Character-level grammars can therefore be annotated with special disambiguation declarations to specify which parse trees to discard in case of ambiguity. Unfortunately, it is very hard to determine whether all ambiguities have been covered. The task of searching for ambiguities in a grammar is very complex and time consuming, and is therefore best automated. Since the invention of context-free grammars, several ambiguity detection methods have been developed to this end. However, the ambiguity problem for context-free grammars is undecidable in general, so the perfect detection method cannot exist. This implies a trade-off between accuracy and termination. Methods that apply exhaustive searching are able to correctly find all ambiguities, but they might never terminate. On the other hand, approximative search techniques do always produce an ambiguity report, but these might contain false positives or false negatives. Nevertheless, the fact that every method has flaws does not mean that ambiguity detection cannot be useful in practice. This thesis investigates ambiguity detection with the aim of checking grammars for programming languages. The challenge is to improve upon the state-of-the-art, by finding accurate enough methods that scale to realistic grammars. First we evaluate existing methods with a set of criteria for practical usability. Then we present various improvements to ambiguity detection in the areas of accuracy, performance and report quality. The main contributions of this thesis are two novel techniques. The first is an ambi- guity detection method that applies both exhaustive and approximative searching, called AMBIDEXTER. The key ingredient of AMBIDEXTER is a grammar filtering technique that can remove harmless production rules from a grammar. A production rule is harmless if it does not contribute to any ambiguity in the grammar. Any found harmless rules can therefore safely be removed. This results in a smaller grammar that still contains the same ambiguities as the original one. However, it can now be searched with exhaustive techniques in less time. The grammar filtering technique is formally proven correct, and experimentally validated. A prototype implementation is applied to a series of programming language grammars, and the performance of exhaustive detection methods are measured before and after filtering. The results show that a small investment in filtering time can substantially reduce the run-time of exhaustive searching, sometimes with several orders of magnitude. After this evaluation on token-based grammars, the grammar filtering technique is extended for use with character-level grammars. The extensions deal with the increased complexity of these grammars, as well as their disambiguation declarations. This enables the detection of productions that are harmless due to disambiguation. The extentions are experimentally validated on another set of programming language grammars from practice, with similar results as before. Measurements show that, even though character-level grammars are more expensive to filter, the investment is still very worthwhile. Exhaustive search times were again reduced substantially. The second main contribution of this thesis is DR. AMBIGUITY, an expert system to help grammar developers to understand and solve found ambiguities. If applied to an ambiguous sentence, DR. AMBIGUITY analyzes the causes of the ambiguity and proposes a number of applicable solutions. A prototype implementation is presented and evaluated with a mature Java grammar. After removing disambiguation declarations from the grammar we analyze sentences that have become ambiguous by this removal. The results show that in all cases the removed filter is proposed by DR. AMBIGUITY as a possible cure for the ambiguity. Concluding, this thesis improves ambiguity detection with two novel methods. The first is the ambiguity detection method AMBIDEXTER that applies grammar filtering to substantially speed up exhaustive searching. The second is the expert system DR. AMBIGUITY that automatically analyzes found ambiguities and proposes applicable cures. The results obtained with both methods show that automatic ambiguity detection is now ready for realistic programming language grammars.
|
5 |
Formalization of context-free language theoryRAMOS, Marcus Vinícius Midena 18 January 2016 (has links)
Submitted by Fabio Sobreira Campos da Costa (fabio.sobreira@ufpe.br) on 2016-08-08T13:11:15Z
No. of bitstreams: 2
license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5)
tese.pdf: 4855618 bytes, checksum: 717d268b142705bdc8ce106731a257db (MD5) / Made available in DSpace on 2016-08-08T13:11:15Z (GMT). No. of bitstreams: 2
license_rdf: 1232 bytes, checksum: 66e71c371cc565284e70f40736c94386 (MD5)
tese.pdf: 4855618 bytes, checksum: 717d268b142705bdc8ce106731a257db (MD5)
Previous issue date: 2016-03-28 / CAPEs / Proof assistants are software-based tools that are used in the mechanization of proof
construction and validation in mathematics and computer science, and also in certified program
development. Different such tools are being increasingly used in order to accelerate and simplify
proof checking, and the Coq proof assistant is one of the most known and used. Language and
automata theory is a well-established area of mathematics, relevant to computer science foundations
and information technology. In particular, context-free language theory is of fundamental
importance in the analysis, design and implementation of computer programming languages.
This work describes a formalization effort, using the Coq proof assistant, of fundamental results
of the classical theory of context-free grammars and languages. These include closure properties
(union, concatenation and Kleene star), grammar simplification (elimination of useless symbols,
inaccessible symbols, empty rules and unit rules), the existence of a Chomsky Normal Form for
context-free grammars and the Pumping Lemma for context-free languages. To achieve this,
several steps had to be fulfilled, including (i) understanding of the characteristics, importance and
benefits of mathematical formalization, specially in computer science, (ii) familiarization with
the underlying mathematical theories used in proof assistants, (iii) familiarization with the Coq
proof assistant, (iv) review of the strategies used in the informal proofs of the main results of the
context-free language theory and finally (iv) selection and adequation of the representation and
proof strategies adopted in order the achieve the desired objectives. The result is an important
set of libraries covering the main results of context-free language theory, with more than 500
lemmas and theorems fully proved and checked. This is probably the most comprehensive
formalization of the classical context-free language theory in the Coq proof assistant done to the
present date, and includes the remarkable result that is the formalization of the Pumping Lemma
for context-free languages. The perspectives for the further development of this work are diverse
and can be grouped in three different areas: inclusion of new devices and results, code extraction
and general enhancements of its libraries. / Assistentes de prova são ferramentas de software que são usadas na mecanização da
construção e da validação de provas na matemática e na ciência da computação, e também no
desenvolvimento de programas certificados. Diferentes ferramentas estão sendo usadas de forma
cada vez mais frequente para acelerar e simplificar a verificação de provas, e o assistente de
provas Coq é uma das mais conhecidas e utilizadas. A teoria de linguagens e de autômatos
é uma área bem estabelecida da matemática, com relevância para os fundamentos da ciência
da computação e a tecnologia da informação. Em particular, a teoria das linguagens livres de
contexto é de fundamental importância na análise, no projeto e na implementação de linguagens
de programação de computadores. Este trabalho descreve um esforço de formalização, usando
o assistente de provas Coq, de resultados fundamentais da teoria clássica das gramáticas e
linguagens livres de contexto. Estes incluem propriedades de fechamento (união, concatenação
e estrela de Kleene), simplificação gramatical (eliminação de símbolos inúteis, de símbolos
inacessíveis, de regras vazias e de regras unitárias), a existência da Forma Normal de Chomsky
para gramáticas livres de contexto e o Lema do Bombeamento para linguagens livres de contexto.
Para alcançar estes resultados, diversas etapas precisaram ser cumpridas, incluindo (i) o
entendimento das características, da importância e dos benefícios da formalização matemática,
especialmente na ciência da computação, (ii) a familiarização com as teorias matemáticas fundamentais utilizadas pelos assistentes de provas, (iii) a familiarização com o assistente de provas
Coq, (iv) a revisão das estratégias usadas nas provas informais dos principais resultados da teoria
das linguagens livres de contexto e, finalmente, (v) a seleção e adequação das estratégias de
representação e prova adotadas para permitir o alcance dos resultados pretendidos. O resultado é
um importante conjunto de bibliotecas cobrindo os principais resultados da teoria das linguagens
livres de contexto, com mais de 500 lemas e teoremas totalmente provados e verificados. Esta
é provavelmente a formalização mais abrangente da teoria clássica das linguagens livres de
contexto jamais feita no assistente de provas Coq, e inclui o importante resultado que é a formalização
do Lema do Bombeamento para linguagens livres de contexto. As perspectivas para
novos desenvolvimentos a partir deste trabalho são diversas e podem ser agrupadas em três áreas
diferentes: inclusão de novos dispositivos e resultados, extração de código e aprimoramentos
gerais das suas bibliotecas.
|
6 |
Métodos de pontos interiores como alternativa para estimar os parâmetros de uma gramática probabilística livre do contexto / Interior point methods as an alternative for estimating parameters of a stochastic context-free grammarMamián López, Esther Sofía, 1985- 10 July 2013 (has links)
Orientadores: Aurelio Ribeiro Leite de Oliveira, Fredy Angel Amaya Robayo / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Matemática, Estatística e Computação Científica / Made available in DSpace on 2018-08-23T17:46:00Z (GMT). No. of bitstreams: 1
MamianLopez_EstherSofia_M.pdf: 1176541 bytes, checksum: 8f49901f40e77c9511c30e86c0d1bb0d (MD5)
Previous issue date: 2013 / Resumo: Os modelos probabilísticos de uma linguagem (MPL) são modelos matemáticos onde é definida uma função de probabilidade que calcula a probabilidade de ocorrência de uma cadeia em uma linguagem. Os parâmetros de um MPL, que são as probabilidades de uma cadeia, são aprendidos a partir de uma base de dados (amostras de cadeias) pertencentes à linguagem. Uma vez obtidas as probabilidades, ou seja, um modelo da linguagem, existe uma medida para comparar quanto o modelo obtido representa a linguagem em estudo. Esta medida é denominada perplexidade por palavra. O modelo de linguagem probabilístico que propomos estimar, está baseado nas gramáticas probabilísticas livres do contexto. O método clássico para estimar os parâmetros de um MPL (Inside-Outside) demanda uma grande quantidade de tempo, tornando-o inviável para aplicações complexas. A proposta desta dissertação consiste em abordar o problema de estimar os parâmetros de um MPL usando métodos de pontos interiores, obtendo bons resultados em termos de tempo de processamento, número de iterações até obter convergência e perplexidade por palavra / Abstract: In a probabilistic language model (PLM), a probability function is defined to calculate the probability of a particular string ocurring within a language. These probabilities are the PLM parameters and are learned from a corpus (string samples), being part of a language. When the probabilities are calculated, with a language model as a result, a comparison can be realized in order to evaluate the extent to which the model represents the language being studied. This way of evaluation is called perplexity per word. The PLM proposed in this work is based on the probabilistic context-free grammars as an alternative to the classic method inside-outside that can become quite time-consuming, being unviable for complex applications. This proposal is an approach to estimate the PLM parameters using interior point methods with good results being obtained in processing time, iterations number until convergence and perplexity per word / Mestrado / Matematica Aplicada / Mestra em Matemática Aplicada
|
7 |
[en] CORRESPONDENCE BETWEEN PEGS AND CLASSES OF CONTEXT-FREE GRAMMARS / [pt] CORRESPONDÊNCIA ENTRE PEGS E CLASSES DE GRAMÁTICAS LIVRES DE CONTEXTOSERGIO QUEIROZ DE MEDEIROS 31 January 2011 (has links)
[pt] Gramáticas de Expressões de Parsing (PEGs) são um formalismo que
permite descrever linguagens e que possui como característica distintiva
o uso de um operador de escolha ordenada. A classe de linguagens descrita
por PEGs contém propriamente todas as linguagens livres de contexto determinísticas. Nesta tese discutimos a correspondência de PEGs com dois
outros formalismos usados para descrever linguagens: expressões regulares
e Gramáticas Livres de Contexto (CFGs). Apresentamos uma formalização
de expressões regulares usando semântica natural e mostramos uma transformação
para converter expressões regulares em PEGs que descrevem a
mesma linguagem; essa transformação pode ser facilmente adaptada para
acomodar diversas extensões usadas por bibliotecas de expressões regulares
(e.g., repetição preguiçosa e subpadrões independentes). Também apresentamos
uma nova formalização de CFGs usando semântica natural e
mostramos a correspondência entre CFGs lineares à direita e PEGs equivalentes.
Além disso, mostramos que gramáticas LL(1) com uma pequena
restrição descrevem a mesma linguagem quando interpretadas como CFGs
e quando interpretadas como PEGs. Por fim, mostramos como transformar
CFGs LL(k)-forte em PEGs equivalentes. / [en] Parsing Expression Grammars (PEGs) are a formalism that allow us to
describe languages and that has as its distinguishing feature the use of
an ordered choice operator. The class of languages described by PEGs
properly contains all deterministic context-free languages. In this thesis
we discuss the correspondence between PEGs and two other formalisms
used to describe languages: regular expressions and Context-Free Grammars
(CFGs). We present a new formalization of regular expressions that uses
natural semantics and we show a transformation to convert a regular
expression into a PEG that describes the same language; this transformation
can be easily adapted to accommodate several extensions used by regular
expression libraries (e.g., lazy repetition and independent subpatterns). We
also present a new formalization of CFGs that uses natural semantics and we
show the correspondence between right linear CFGs and equivalent PEGs.
Moreover, we show that LL(1) grammars with a minor restriction define the
same language when interpreted as a CFG and when interpreted as a PEG.
Finally, we show how to transform strong-LL(k) CFGs into PEGs that are
equivalent.
|
8 |
Grammar-Based Translation Framework / Grammar-Based Translation FrameworkVít, Radek January 2019 (has links)
V této práci prozkoumáváme existující algoritmy pro přijímání jazyků definovaných bezkontextovými gramatikami. Na základě těchto znalostí navrhujeme nový model pro reprezentaci LR automatů a s jeho pomocí definujeme nový algoritmus LSCELR. Modifikujeme algoritmy pro přijímání jazyků k vytvoření algoritmů pro překlad založený na překladových gramatikách. Definujeme atributové překladové gramatiky jako rozšířené překladové gramatiky pro definici vztahů mezi vstupními a výstupními symboly překladu. Implementujeme překladový framework ctf založený na gramatikách, který implementuje překlad pomocí LSCELR. Definujeme jazyk pro popis atributových překladových gramatik a implementujeme překladač pro překlad této reprezentace do zdrojového kódu pro implementovaný framework.
|
9 |
Řízená syntaktická analýza / Regulated ParsingWolf, Dominik January 2011 (has links)
This work deals with advanced models of context-free grammars and explores the possibilities of adaptation and usefulness for deterministic parsing of non-context-free sructures by deep parsing method. It introduces adapted model of context-free grammar named LL programmed grammar and adapted deep pushdown automaton that makes deterministic parsing of non-context-free structures possible.
|
10 |
Gramatické systémy a syntaxí řízený překlad založený na nich / Grammar Systems and Syntax-Directed Translation Based on ThemHandlíř, Jaroslav January 2017 (has links)
The thesis examines the theory of formal languages in the field of context-free grammars. It focuses mainly on the possibilities and models of collaborating grammars to solve a common problem. In this context, it presents grammatical systems that have been designed as a formal means for describing distributed and parallel processing. After introducing to the problematics, the thesis focuses on the practical use of these mechanisms in the translation controlled syntax, and therefore the second part of the thesis deals with the implementation of a dynamic syntactic analyzer that uses more grammars during the analysis. With respect to the greatest user friendliness and the possible didactic use, the application is implemented using modern web technologies HTML5, JavaScript, AngularJS, CSS3, LESS and more.
|
Page generated in 0.0524 seconds