Global ETD Search

101	Efficient computation of advanced skyline queries. Yuan, Yidong, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Skyline has been proposed as an important operator for many applications, such as multi-criteria decision making, data mining and visualization, and user-preference queries. Due to its importance, skyline and its computation have recently received considerable attention from the database research community. All the existing techniques, however, focus on conventional databases. They are not applicable to online computation environments, such as data streams. In addition, the existing studies consider the efficiency of skyline computation only, while the fundamental problem of the semantics of skylines still remains open. In this thesis, we study three problems of skyline computation: (1) online skyline computation over data streams; (2) skyline cube computation and its analysis; and (3) top-k most representative skyline. To tackle the problem of online skyline computation, we develop a novel framework which converts the more expensive multi-dimensional skyline computation to stabbing queries in 1-dimensional space. Based on this framework, a rigorous theoretical analysis of the time complexity of online skyline computation is provided. Then, efficient algorithms are proposed to support ad hoc and continuous skyline queries over data streams. Inspired by the idea of the data cube, we propose a novel concept of skyline cube, which consists of the skylines of all possible non-empty subsets of a given full space. We identify the unique sharing strategies for skyline cube computation and develop two efficient algorithms which compute the skyline cube in a bottom-up and top-down manner, respectively. Finally, a theoretical framework to answer the question about the semantics of skyline and an analysis of multidimensional subspace skylines are presented.
Motivated by the fact that the full skyline may be less informative because it generally consists of a large number of skyline points, we propose a novel skyline operator -- top-k most representative skyline. The top-k most representative skyline operator selects the k skyline points so that the number of data points which are dominated by at least one of these k skyline points is maximized. To compute the top-k most representative skyline, two efficient algorithms and their theoretical analysis are presented. Database management. Database design. Question-answering systems. Semantics - Data processing.
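Illustrative sketch (not taken from the thesis): the definition of the top-k most representative skyline operator given in the abstract above can be made concrete with a brute-force example. The function names, the sample points, and the convention that smaller values are better in every dimension are all assumptions made for illustration; the thesis's own efficient algorithms are not reproduced here.

```python
from itertools import combinations


def dominates(p, q):
    """True if p is no worse than q in every dimension and strictly better in one.

    Assumption: smaller values are better in every dimension.
    """
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def top_k_representative(points, k):
    """Exhaustively pick k skyline points that together dominate the most points."""
    sky = skyline(points)
    k = min(k, len(sky))

    def coverage(chosen):
        # Number of data points dominated by at least one chosen skyline point.
        return sum(1 for q in points if any(dominates(p, q) for p in chosen))

    return max(combinations(sky, k), key=coverage)


# Hypothetical sample data.
points = [(1, 5), (2, 2), (5, 1), (3, 3), (4, 4), (2, 6), (6, 2)]
sky = skyline(points)
best = top_k_representative(points, 1)
```

The exhaustive search tries every k-subset of the skyline, so it is only practical for tiny inputs; it is meant to show what the operator returns, not how to compute it efficiently.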
102	Lexical approaches to backoff in statistical parsing Lakeland, Corrin, n/a January 2006 (has links) This thesis develops a new method for predicting probabilities in a statistical parser so that more sophisticated probabilistic grammars can be used. A statistical parser uses a probabilistic grammar derived from a training corpus of hand-parsed sentences. The grammar is represented as a set of constructions - in a simple case these might be context-free rules. The probability of each construction in the grammar is then estimated by counting its relative frequency in the corpus. A crucial problem when building a probabilistic grammar is to select an appropriate level of granularity for describing the constructions being learned. The more constructions we include in our grammar, the more sophisticated a model of the language we produce. However, if too many different constructions are included, then our corpus is unlikely to contain reliable information about the relative frequency of many constructions. In existing statistical parsers two main approaches have been taken to choosing an appropriate granularity. In a non-lexicalised parser constructions are specified as structures involving particular parts-of-speech, thereby abstracting over individual words. Thus, in the training corpus two syntactic structures involving the same parts-of-speech but different words would be treated as two instances of the same event. In a lexicalised grammar the assumption is that the individual words in a sentence carry information about its syntactic analysis over and above what is carried by its part-of-speech tags. Lexicalised grammars have the potential to provide extremely detailed syntactic analyses; however, Zipf's law makes it hard for such grammars to be learned. In this thesis, we propose a method for optimising the trade-off between informative and learnable constructions in statistical parsing.
We implement a grammar which works at a level of granularity in between single words and parts-of-speech, by grouping words together using unsupervised clustering based on bigram statistics. We begin by implementing a statistical parser to serve as the basis for our experiments. The parser, based on that of Michael Collins (1999), contains a number of new features of general interest. We then implement a model of word clustering, which we believe is the first to deliver vector-based word representations for an arbitrarily large lexicon. Finally, we describe a series of experiments in which the statistical parser is trained using categories based on these word representations. parsing (computer grammar) computational linguistics linguistics statistical methods
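Illustrative sketch (not taken from the thesis): the relative-frequency estimate mentioned in the abstract above can be shown on a toy set of context-free rules. The function name and the toy rule list are hypothetical, and normalising each rule count by the count of its left-hand side is an assumption (it is the usual maximum-likelihood estimate for a probabilistic context-free grammar), not a detail confirmed by the abstract.

```python
from collections import Counter


def estimate_rule_probabilities(rule_occurrences):
    """Estimate P(rule | left-hand side) by relative frequency.

    rule_occurrences: list of (lhs, rhs) pairs, one per rule use observed
    in a hand-parsed corpus; rhs is a tuple of symbols.
    """
    rule_counts = Counter(rule_occurrences)
    lhs_counts = Counter(lhs for lhs, _ in rule_occurrences)
    return {rule: n / lhs_counts[rule[0]] for rule, n in rule_counts.items()}


# Hypothetical rule uses read off a tiny treebank.
observed = [
    ("S", ("NP", "VP")),
    ("NP", ("DT", "NN")),
    ("NP", ("NN",)),
    ("NP", ("DT", "NN")),
    ("VP", ("VB", "NP")),
]
probs = estimate_rule_probabilities(observed)
```

With so few observations, a rule seen once or never gets an unreliable estimate, which is the sparsity problem the abstract attributes to fine-grained (lexicalised) constructions.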
103	Natural language program analysis: combining natural language processing with program analysis to improve software maintenance tools / Shepherd, David. January 2007 (has links) Thesis (Ph.D.)--University of Delaware, 2007. / Principal faculty advisors: Lori L. Pollock and Vijay K. Shanker, Dept. of Computer & Information Sciences. Includes bibliographical references.
104	Efficient computation of advanced skyline queries. Yuan, Yidong, Computer Science & Engineering, Faculty of Engineering, UNSW January 2007 (has links) Skyline has been proposed as an important operator for many applications, such as multi-criteria decision making, data mining and visualization, and user-preference queries. Due to its importance, skyline and its computation have recently received considerable attention from the database research community. All the existing techniques, however, focus on conventional databases. They are not applicable to online computation environments, such as data streams. In addition, the existing studies consider the efficiency of skyline computation only, while the fundamental problem of the semantics of skylines still remains open. In this thesis, we study three problems of skyline computation: (1) online skyline computation over data streams; (2) skyline cube computation and its analysis; and (3) top-k most representative skyline. To tackle the problem of online skyline computation, we develop a novel framework which converts the more expensive multi-dimensional skyline computation to stabbing queries in 1-dimensional space. Based on this framework, a rigorous theoretical analysis of the time complexity of online skyline computation is provided. Then, efficient algorithms are proposed to support ad hoc and continuous skyline queries over data streams. Inspired by the idea of the data cube, we propose a novel concept of skyline cube, which consists of the skylines of all possible non-empty subsets of a given full space. We identify the unique sharing strategies for skyline cube computation and develop two efficient algorithms which compute the skyline cube in a bottom-up and top-down manner, respectively. Finally, a theoretical framework to answer the question about the semantics of skyline and an analysis of multidimensional subspace skylines are presented.
Motivated by the fact that the full skyline may be less informative because it generally consists of a large number of skyline points, we propose a novel skyline operator -- top-k most representative skyline. The top-k most representative skyline operator selects the k skyline points so that the number of data points which are dominated by at least one of these k skyline points is maximized. To compute the top-k most representative skyline, two efficient algorithms and their theoretical analysis are presented. Database management. Database design. Question-answering systems. Semantics - Data processing.
105	Chinese to English machine translation using SNePS as an interlingua Liao, Min-Hung. January 1997 (has links) Thesis (M.A.)--State University of New York at Buffalo, 1997. / Includes bibliographical references (leaves 172-174). Also available in print.
106	Understanding acknowledgments / Ward, Karen, January 2001 (has links) Thesis (Ph. D.)--Oregon Graduate Institute, 2001.
107	Generating documents by means of computational registers Oldham, Joseph Dowell. January 2000 (has links) (PDF) Thesis (Ph. D.)--University of Kentucky, 2000. / Title from document title page. Document formatted into pages; contains ix, 169 p. : ill. Includes abstract. Includes bibliographical references (p. 160-167).
108	The use of prosodic features in Chinese speech recognition and spoken language processing / Wong, Jimmy Pui Fung. January 2003 (has links) Thesis (M.Phil.)--Hong Kong University of Science and Technology, 2003. / Includes bibliographical references (leaves 97-101). Also available in electronic version. Access restricted to campus users.
109	Flexible semantic matching of rich knowledge structures Yeh, Peter Zei-Chan 28 August 2008 (has links) Not available / text Semantics--Data processing Semantics--Data processing--Case studies
110	Following natural language route instructions MacMahon, Matthew Tierney 28 August 2008 (has links) Following natural language instructions requires transforming language into situated conditional procedures; robustly following instructions, despite the director's natural mistakes and omissions, requires the pragmatic combination of language, action, and domain knowledge. This dissertation demonstrates a software agent that parses, models and executes human-written natural language instructions to accomplish complex navigation tasks. We compare the performance against people following the same instructions. By selectively removing various syntactic, semantic, and pragmatic abilities, this work empirically measures how often these abilities are necessary to correctly navigate along extended routes through unknown, large-scale environments to novel destinations. To study how route instructions are written and followed, this work presents a new corpus of 1520 free-form instructions from 30 directors for 252 routes in three virtual environments. 101 other people followed these instructions and rated them for quality, successfully reaching and identifying the destination on only approximately two-thirds of the trials. Our software agent, MARCO, followed the same instructions in the same environments with a success rate approaching human levels. Overall, instructions subjectively rated 4 or better of 6 comprise just over half of the corpus; MARCO performs at 88% of human performance on these instructions. MARCO's performance was a strong predictor of human performance and ratings of individual instructions. Ablation experiments demonstrate that implicit procedures are crucial for following verbal instructions using an approach integrating language, knowledge and action. Other experiments measure the performance impact of linguistic, execution, and spatial abilities in successfully following natural language route instructions.
