Natural language (NL) is a common medium humans use to express ideas and communicate with others, while programming languages (PL) are the ``language'' humans use to communicate with machines.
As NL and PL were designed for different purposes, they differ considerably in structure and capabilities.
Programming using PL can take novices months to learn. Meanwhile, users are already familiar with NL.
Therefore, natural language programming (NLPr) holds great potential: it lets non-experts ``program'' in a language they already know and offers a Low-Code/No-Code development experience.
However, many challenges in developing NLPr systems are yet to be addressed, namely how to disambiguate NL semantics, how to validate inputs and provide helpful feedback, and how to effectively generate executable programs from the semantic meaning.
This dissertation addresses these issues by proposing a Controlled Object-Oriented Language (COOL) model that disambiguates and analyzes the semantic meaning of English inputs, and by implementing a LEGO robot NLPr platform.
Two main approaches that connect the current research in general-purpose NLP to NLPr are taken:
(1) A domain-specific lexicon and function library serve as the syntax and semantic space.
Even though NL can be complex and expressive, functions for the specific robot domain can be fulfilled with libraries built of a finite set of objects and functions.
(2) An error-reporting and feedback mechanism detects erroneous sentences, explains possible reasons, and provides debugging and rewriting suggestions.
The error-reporting and feedback systems are developed with a hybrid approach that combines rule-based methods, such as finite-state machines (FSM) and dependency-based structural analysis, with a data-based multi-label classification (MLC) method.
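As a rough illustration of approach (1), the sketch below maps a small domain lexicon onto a finite function library. It is not the dissertation's COOL implementation: the `Robot` class, the `LEXICON` entries, the `FUNCTION_LIBRARY` table, and the `interpret` helper are all hypothetical names chosen for this example, and the keyword matching stands in for the actual semantic analysis.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Robot:
    """Hypothetical robot that records the commands it receives."""
    log: List[str] = field(default_factory=list)

    def move(self, direction: str, amount: int) -> None:
        self.log.append(f"move {direction} {amount}")

    def turn(self, direction: str, amount: int) -> None:
        self.log.append(f"turn {direction} {amount}")


# Assumed domain lexicon: surface word -> (category, canonical value).
LEXICON: Dict[str, Tuple[str, str]] = {
    "move": ("action", "move"),
    "go": ("action", "move"),
    "turn": ("action", "turn"),
    "rotate": ("action", "turn"),
    "forward": ("direction", "forward"),
    "backward": ("direction", "backward"),
    "left": ("direction", "left"),
    "right": ("direction", "right"),
}

# Assumed function library: canonical action -> robot function.
FUNCTION_LIBRARY: Dict[str, Callable[[Robot, str, int], None]] = {
    "move": Robot.move,
    "turn": Robot.turn,
}


def interpret(sentence: str, robot: Robot) -> None:
    """Pick out an action, a direction, and a number, then call the library."""
    action: Optional[str] = None
    direction: Optional[str] = None
    amount: Optional[int] = None
    for raw in sentence.lower().split():
        token = raw.strip(".,!?")
        if token.isdigit():
            amount = int(token)
        elif token in LEXICON:
            category, value = LEXICON[token]
            if category == "action":
                action = value
            else:
                direction = value
        # Words outside the lexicon (e.g. "steps") are ignored in this sketch.
    if action is None or direction is None or amount is None:
        raise ValueError("sentence needs an action, a direction, and a number")
    FUNCTION_LIBRARY[action](robot, direction, amount)


robot = Robot()
interpret("Go forward 10 steps", robot)
interpret("Rotate left 90 degrees.", robot)
print(robot.log)
```

Because both the lexicon and the library are finite, every accepted sentence resolves to one known function call, which is the sense in which a restricted domain reduces ambiguity.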
Experimental results and user studies show that, because the proposed model and approaches reduce ambiguity within the target domain, the NLPr system can process a relatively expressive controlled NL for robot motion control and generate executable code from the English input.
When the system is confronted with erroneous sentences, it produces error messages, suggestions, and example sentences for users.
With the proposed language model and system, NL's structural and semantic information can be transformed into the intermediate representations used for program synthesis, which addresses the situation where the considerable amount of data needed for a data-based model is unavailable.

Doctor of Philosophy

Natural language (NL) is one of the most common mediums humans use daily to express and explain ideas and communicate with each other. In contrast, programming languages (PL) are the ``language'' humans use to communicate with machines.
Because NL and PL differ in purpose, medium, and audience, their structure and capabilities differ considerably.
NL is more expressive and natural and sometimes can be rather complex, while PL is primarily short, straightforward, and not as expressive as NL.
The need for programming has increased in recent years.
However, learning a programming language can easily take novice users months or more.
At the same time, all potential users are familiar with at least one NL.
As such, natural language programming (NLPr), a technology that enables people to program with NL, holds great potential: it lets non-experts ``program'' in a language they already know and offers a Low-Code or even No-Code development experience.
However, despite recent research into NLPr, many challenges with developing NLPr systems are yet to be addressed, namely how to disambiguate natural language semantics, how to validate inputs and provide helpful feedback with a limited amount of data, and how to effectively generate the executable programs based on the semantic meanings.
This dissertation addresses these issues by proposing a Controlled Object-Oriented Language (COOL) model that disambiguates and analyzes the semantic meaning of English inputs, and by implementing a LEGO robot NLPr platform.
Two main approaches that connect the current research in general-purpose NLP techniques to NLPr are taken:
(1) The first is developing a domain-specific lexicon and function library with the designed COOL model to serve as the syntax and semantic space. Even though natural language can be extremely complex and expressive, the functions for the specific robot domain can be fulfilled with libraries built of a finite set of objects and functions.
(2) An error-reporting and feedback mechanism detects erroneous sentences, explains possible reasons, and provides debugging and rewriting suggestions.
The error-reporting and feedback systems are developed with a hybrid approach that combines rule-based methods, such as finite-state machines (FSM) and dependency-based structural analysis, with a data-based multi-label classification (MLC) method.
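To make the rule-based half of approach (2) concrete, the following is a minimal sketch of a finite-state machine that checks word order and reports what it expected when a sentence fails. The states, the `WORD_CATEGORIES` table, and the `validate` function are assumptions made for this example rather than the dissertation's actual grammar, and the dependency analysis and MLC components are not shown.

```python
from typing import Dict, List, Tuple

# Assumed word categories for a tiny robot-motion vocabulary.
WORD_CATEGORIES: Dict[str, str] = {
    "move": "ACTION", "turn": "ACTION",
    "forward": "DIRECTION", "backward": "DIRECTION",
    "left": "DIRECTION", "right": "DIRECTION",
    "steps": "UNIT", "degrees": "UNIT",
}

# Assumed transitions: (current state, token category) -> next state.
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("START", "ACTION"): "HAS_ACTION",
    ("HAS_ACTION", "DIRECTION"): "HAS_DIRECTION",
    ("HAS_DIRECTION", "NUMBER"): "HAS_AMOUNT",
    ("HAS_AMOUNT", "UNIT"): "DONE",
}
ACCEPTING = {"HAS_AMOUNT", "DONE"}

# For feedback: which categories each state can accept next.
EXPECTED: Dict[str, List[str]] = {}
for (state_name, category_name) in TRANSITIONS:
    EXPECTED.setdefault(state_name, []).append(category_name)


def categorize(token: str) -> str:
    """Classify a token as a number, a known category, or unknown."""
    if token.isdigit():
        return "NUMBER"
    return WORD_CATEGORIES.get(token, "UNKNOWN")


def validate(sentence: str) -> Tuple[bool, str]:
    """Run the FSM over the sentence and return (is_valid, feedback)."""
    state = "START"
    for token in sentence.lower().rstrip(".").split():
        category = categorize(token)
        next_state = TRANSITIONS.get((state, category))
        if next_state is None:
            options = EXPECTED.get(state, [])
            hint = " or ".join(options) if options else "end of sentence"
            return False, f"Unexpected '{token}' ({category}); expected {hint}."
        state = next_state
    if state not in ACCEPTING:
        hint = " or ".join(EXPECTED[state])
        return False, f"Sentence is incomplete; expected {hint} next."
    return True, "OK"


for example in ("Move forward 10 steps.", "move 10 steps", "turn left"):
    print(example, "->", validate(example))
```

A rule-based check like this needs no training data, which is why it pairs naturally with a data-based classifier when labeled examples are scarce.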
Experimental results and user studies show that, because the proposed language model and approaches reduce ambiguity within the target domain, the designed NLPr system can process a relatively expressive controlled natural language designed for robot motion control and generate executable code based on the extracted semantic information.
When the NLPr system is confronted with erroneous sentences, it produces detailed error messages and provides suggestions and sample sentences for possible fixes to users.
With the simple language model and system proposed, NL's structural and semantic information can be transformed into the intermediate representations used for program synthesis, which addresses the situation where the considerable amount of data needed for a data-based model is unavailable.
Identifier | oai:union.ndltd.org:VTETD/oai:vtechworks.lib.vt.edu:10919/111208 |
Date | 11 July 2022 |
Creators | Zhan, Yue |
Contributors | Electrical and Computer Engineering, Hsiao, Michael S., Min, Chang Woo, Huang, Bert, Zeng, Haibo, Schaumont, Patrick Robert |
Publisher | Virginia Tech |
Source Sets | Virginia Tech Theses and Dissertation |
Language | English |
Detected Language | English |
Type | Dissertation |
Format | ETD, application/pdf, application/pdf |
Rights | In Copyright, http://rightsstatements.org/vocab/InC/1.0/ |