Global ETD Search

91	Component-Based Entity Systems : Modular Object Construction and High Performance Gameplay Wallentin, Olof January 2014 This bachelor thesis examines the design, implementation, and differences between game entity systems, with a focus on a component-based structure. How each affects the other is discussed from both a technical and a design point of view, including possible drawbacks or advantages regarding game design iteration and performance. Since the focus is on component-based entity systems, a clarification of traditional entity systems is required; thus this thesis covers entity systems that are traditional, property-based, container-based, and aggregated component-based. The design and implementation of each system was founded on different generations of programming paradigms, which resulted in a specific compositional structure based on each era of hardware configuration. This thesis analyses the progress of hardware alongside game entity system design to further understand its progression and evolution into today’s standards and implementation. Details on each system are provided from a design perspective for the traditional entity system and with an in-depth view for the component-based entity systems. Programming game object entity C++ component game design. Design
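As a rough illustration of the component-based structure this abstract refers to, the sketch below builds entities from interchangeable data components and lets a separate system act on whichever entities carry the components it needs. It is a minimal sketch in Python rather than the thesis's C++, and every name in it (`Position`, `Velocity`, `Entity`, `movement_system`) is hypothetical, not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Position:
    """Hypothetical data-only component: where the entity is."""
    x: float = 0.0
    y: float = 0.0


@dataclass
class Velocity:
    """Hypothetical data-only component: how fast the entity moves."""
    dx: float = 0.0
    dy: float = 0.0


class Entity:
    """An entity is just a bag of components, keyed by component type."""

    def __init__(self, *components):
        self.components = {type(c): c for c in components}

    def get(self, kind):
        # Returns None when the entity does not carry this component.
        return self.components.get(kind)


def movement_system(entities, dt):
    """Advance every entity that has both a Position and a Velocity."""
    for entity in entities:
        pos, vel = entity.get(Position), entity.get(Velocity)
        if pos is not None and vel is not None:
            pos.x += vel.dx * dt
            pos.y += vel.dy * dt
```

An entity without a `Velocity` component (a static rock, say) is simply skipped by `movement_system`, which is the kind of modular construction that component-based designs aim for.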
92	Characterizing Concepts in Taxonomy for Entity Recommendations Cheekula, Siva Kumar 05 June 2017 No description available. Computer Science Taxonomies Entity Recommendations Knowledge-bases Characteristics of Taxonomy
93	Temporal Graph Record Linkage and k-Safe Approximate Match Jupin, Joseph January 2016 Since the advent of electronic data processing, organizations have accrued vast amounts of data contained in multiple databases with no reliable global unique identifier. These databases were developed by different departments for different purposes at different times. Organizing and analyzing these data for human services requires linking records from all sources. Record linkage (RL) is a process that connects records that refer to the same or a sufficiently similar entity across multiple heterogeneous databases. RL is a data- and compute-intensive, mission-critical process. It must be efficient enough to process big data and effective enough to provide accurate matches. We evaluated an RL system that is currently in use by a local health and human services department. We found that it used the typical approach proposed by Fellegi and Sunter, with tuple-by-tuple processing and Soundex as the primary approximate string matching method. Soundex has been found to be unreliable both as a phonetic and as an approximate string matching method. We found that the data, in many cases, have more than one value per field, suggesting that the data were queried from a 5NF database. Consider that if a woman has been married three times, she may have up to four last names on record. This query process produced more than one tuple per database/entity, apparently generating a Cartesian product of these data. In many cases, more than a dozen tuples were observed for a single database/entity. This approach is both ineffective and inefficient. An effective RL method should handle this multi-data without redundancy and use edit-distance for approximate string matching. However, due to its high computational complexity, edit-distance will not scale well to big data problems.
We developed two methodologies for resolving the aforementioned issues: PSH and ALIM. PSH (Probabilistic Signature Hash) is a composite method that increases the speed of Damerau-Levenshtein (DL) edit-distance by combining signature filtering, probabilistic hashing, length filtering and prefix pruning. It is also lossless because it does not lose any true positive matches. ALIM (Aggregate Link and Iterative Match) is a graph-based record linkage methodology that uses a multi-graph to store demographic data about people. ALIM performs string matching as records are inserted into the graph. ALIM eliminates data redundancy and stores the relationships between data. We tested PSH for string comparison and found it to be approximately 6,000 times faster than DL. We tested it against the trie-join methods and found that they are up to 6.26 times faster but lose between 10 and 20 percent of true positives. We tested ALIM against a method currently in use by a local health and human services department and found that ALIM produced significantly more matches (even with more restrictive match criteria) and ran more than twice as fast. ALIM handles the multi-data problem, and PSH allows the use of edit-distance comparison in this RL model. ALIM is more efficient and effective than a currently implemented RL system. This model can also be expanded to perform social network analysis and temporal data modeling. For human services, temporal modeling can reveal how policy changes and treatments affect clients over time, and social network analysis can determine the effects of these on whole families by facilitating family linkage. / Computer and Information Science Information Science Computer Science Entity Matching Record Linkage String Matching
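To make the idea of a lossless filter in front of an edit-distance computation concrete, here is a minimal sketch. It is not the PSH method from the abstract: it shows only a length filter combined with the restricted (optimal string alignment) form of Damerau-Levenshtein distance, and the function names `osa_distance` and `within_distance` are hypothetical. The filter is safe because two strings whose lengths differ by more than k cannot be within k edits of each other.

```python
def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance.

    Counts insertions, deletions, substitutions and transpositions of
    adjacent characters, using the standard dynamic-programming table.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Transposition of two adjacent characters.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[rows - 1][cols - 1]


def within_distance(a: str, b: str, k: int) -> bool:
    """Return True if a and b are within k edits of each other."""
    # Length filter: the distance is at least the length difference,
    # so these pairs can be rejected without running the O(len*len) table.
    if abs(len(a) - len(b)) > k:
        return False
    return osa_distance(a, b) <= k
```

For example, `within_distance("jon", "jonathan", 2)` is rejected by the length check alone, while `within_distance("smith", "smiht", 1)` falls through to the full comparison and succeeds because the swap counts as a single transposition.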
94	Investigating and Recommending Co-Changed Entities for JavaScript Programs Jiang, Zijian January 2020 JavaScript (JS) is one of the most popular programming languages due to its flexibility and versatility, but debugging JS code is tedious and error-prone. In our research, we conducted an empirical study to characterize the relationship between co-changed software entities (e.g., functions and variables), and built a machine learning (ML)-based approach to recommend additional entities to edit given developers’ code changes. Specifically, we first crawled 14,747 commits in 10 open-source projects; for each commit, we created one or more change dependency graphs (CDGs) to model the referencer-referencee relationship between co-changed entities. Next, we extracted the common subgraphs between CDGs to locate recurring co-change patterns between entities. Finally, based on those patterns, we extracted code features from co-changed entities and trained an ML model that recommends entities-to-change given a program commit. According to our empirical investigation, (1) 50% of the crawled commits involve multi-entity edits (i.e., edits that touch multiple entities simultaneously); (2) three recurring patterns commonly exist in all projects; (3) 80–90% of co-changed function pairs either invoke the same function(s), access the same variable(s), or contain similar statement(s); and (4) our ML-based approach CoRec recommended entity changes with high accuracy. This research will improve programmer productivity and software quality. / M.S. / This thesis introduces a tool, CoRec, which provides co-change suggestions when JavaScript programmers fix a bug. A comprehensive empirical study was carried out on 14,747 multi-entity bug fixes in ten open-source JavaScript programs.
We characterized the relationship between co-changed entities (e.g., functions and variables), and extracted the most popular change patterns, based on which we built a machine learning (ML)-based approach to recommend additional entities to edit given developers’ code changes. Our empirical study shows that: (1) 50% of the crawled commits involve multi-entity edits (i.e., edits that touch multiple entities simultaneously); (2) three change patterns commonly exist in all ten projects; (3) 80–90% of co-changed function pairs in the three patterns either invoke the same function(s), access the same variable(s), or contain similar statement(s); and (4) our ML-based approach CoRec recommended entity changes with high accuracy. Our research will improve programmer productivity and software quality. Multi-entity edit change suggestion Machine learning JavaScript
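The notion of co-changed entities can be illustrated with a much simpler baseline than the approach described above. The sketch below is not CoRec and uses no change dependency graphs or ML model: it only counts how often pairs of entities were edited in the same commit and ranks suggestions by that count. The commit data layout (one list of entity names per commit) and the function names `co_change_counts` and `recommend` are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each unordered pair of entities changed together.

    `commits` is an iterable of entity-name collections, one per commit.
    """
    counts = Counter()
    for entities in commits:
        # sorted(set(...)) gives each pair one canonical (a, b) ordering.
        for pair in combinations(sorted(set(entities)), 2):
            counts[pair] += 1
    return counts


def recommend(changed, counts, top_n=3):
    """Suggest entities that historically changed with those in `changed`."""
    scores = Counter()
    for (a, b), n in counts.items():
        if a in changed and b not in changed:
            scores[b] += n
        elif b in changed and a not in changed:
            scores[a] += n
    return [entity for entity, _ in scores.most_common(top_n)]
```

With a hypothetical history of `[["parse", "render"], ["parse", "render", "init"], ["init", "config"]]`, editing `parse` would suggest `render` first, since the two changed together in two commits.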
95	Towards Generalizable Information Extraction with Limited Supervision Wang, Sijia 18 September 2024 Supervised approaches, especially those employing deep neural networks, have showcased impressive performance, relying on a significant volume of manual annotations. However, their effectiveness encounters challenges when attempting to generalize to new languages, domains, or types, particularly in the absence of sufficient annotations. Current methods fall short in effectively addressing information extraction (IE) under limited supervision. In this dissertation, we approach information extraction with limited supervision from three perspectives. Firstly, we refine the previous classification-based extraction paradigm by introducing a query-and-extract framework, which uses target information as natural language queries to extract candidate information from the input text. Additionally, we leverage the excellent generation capability of large language models (LLMs) to produce high-quality annotation data, enriching IE semantics within limited annotation data. We also utilize LLMs' instruction-following capability to iteratively refine and optimize solutions through a debating process. Beyond text-only IE, we define a new multimodal IE task that links an entity mention within heterogeneous information sources to a knowledge base with limited annotation data. We demonstrate that excellent multimodal IE performance can be achieved, even with limited annotation data, by leveraging monomodal external information. These combined efforts aim to make optimal use of limited knowledge, ensuring more robust and generalizable solutions. / Doctor of Philosophy / This dissertation explores the development of information extraction (IE) algorithms and systems that work effectively with limited supervision. Information extraction is a complex and challenging task that involves extracting structured data from plain text.
Traditional IE systems are often tailored to specific tasks and domains where ample annotated data is available, limiting their ability to adapt to new domains. This research focuses on developing IE systems that can generalize to new domains with limited supervision, reducing the reliance on extensive annotations. The proposed solutions demonstrate the potential to transfer knowledge from existing annotations to new tasks and domains, emphasizing the importance of learning from limited data and improving knowledge transfer to previously unknown domains. Information Extraction Limited Supervision Event Extraction Entity Linking
96	Odhad hodnoty firmy / Value estimate Stránská, Michaela January 2010 The goal of the diploma thesis is to estimate the market value of GRUND Corp, in the form of an expert opinion, for the purpose of a sale to an as-yet-unknown buyer. A strategic analysis was carried out before valuing the company; it is one of the key phases of company valuation because it forms the basis for the subsequent valuation. The strategic analysis is followed by a financial analysis that presents the overall financial health of the company. Subsequently, the individual value drivers are defined and a financial plan of the company's individual statements is introduced. The DCF Entity method was used to value the company itself; for completeness and comparison, the company was also valued using the book-value method.
97	Person Name Recognition In Turkish Financial Texts By Using Local Grammar Approach Bayraktar, Ozkan 01 September 2007 Named entity recognition (NER) is the task of identifying the named entities (NEs) in texts and classifying them into semantic categories such as person, organization, and place names and time, date, monetary, and percent expressions. NER has two principal aims: identification of NEs and classification of them into semantic categories. The local grammar (LG) approach has recently been shown to be superior to other NER techniques, such as the probabilistic approach, the symbolic approach, and the hybrid approach, in terms of being able to work with untagged corpora. Unlike most other NER systems, the LG approach does not require any dictionaries or gazetteers, which are lists of proper nouns (PNs) used in NER applications. As a consequence, it is able to recognize NEs in previously unseen texts at minimal cost. Most NER systems are costly due to manual rule compilation, especially in large tagged corpora. They also require some semantic and syntactic analyses to be applied before the pattern generation process, which can be avoided by using the LG approach. In this thesis, we tried to acquire LGs for person names from a large untagged Turkish financial news corpus by using an approach recently applied successfully to a Reuters' English financial news corpus by H. N. Traboulsi. We explored its applicability to the Turkish language by using frequency, collocation, and concordance analyses. In addition, we constructed a list of Turkish reporting verbs. It is an important part of this study because there is no major study about reporting verbs in Turkish.
98	Outomatiese Afrikaanse tekseenheididentifisering / deur Martin J. Puttkammer Puttkammer, Martin Johannes January 2006 An important core technology in the development of human language technology applications is an automatic morphological analyser. Such a morphological analyser consists of various modules, one of which is a tokeniser. At present no tokeniser exists for Afrikaans and it has therefore been impossible to develop a morphological analyser for Afrikaans. Thus, in this research project such a tokeniser is being developed, and the project therefore has two objectives: i) to postulate a tag set for integrated tokenisation, and ii) to develop an algorithm for integrated tokenisation. In order to achieve the first objective, a tag set for the tagging of sentences, named entities, words, abbreviations and punctuation is proposed specifically for the annotation of Afrikaans texts. It consists of 51 tags, which can be expanded in future in order to establish a larger, more specific tag set. The postulated tag set can also be simplified according to the level of specificity required by the user. It is subsequently shown that an effective tokeniser cannot be developed using only linguistic, or only statistical methods. This is due to the complexity of the task: rule-based modules should be used for certain processes (for example sentence recognition), while other processes (for example named-entity recognition) can only be executed successfully by means of a machine-learning module. It is argued that a hybrid system (a system where rule-based and statistical components are integrated) would achieve the best results on Afrikaans tokenisation. Various rule-based and statistical techniques, including a TiMBL-based classifier, are then employed to develop such a hybrid tokeniser for Afrikaans. The final tokeniser achieves an f-score of 97.25% when the complete set of tags is used. For sentence recognition an f-score of 100% is achieved.
The tokeniser also recognises 81.39% of named entities. When a simplified tag set (consisting of only 12 tags) is used to annotate named entities, the f-score rises to 94.74%. The conclusion of the study is that a hybrid approach is indeed suitable for Afrikaans sentencisation, named-entity recognition and tokenisation. The tokeniser will improve if it is trained with more data, while the expansion of gazetteers as well as the tag set will also lead to a more accurate system. / Thesis (M.A. (Applied Language and Literary Studies))--North-West University, Potchefstroom Campus, 2006. Afrikaans Tokenisation Sentence recognition Named-entity recognition Sentence Named entity Word Morphological analysis Natural language processing Computational linguistics TIMBL
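For readers unfamiliar with the f-score figures quoted in this abstract, the sketch below shows how such a score is conventionally derived from counts of correct and incorrect predictions. The abstract does not state which variant was used, so the balanced form (beta = 1, the harmonic mean of precision and recall) is an assumption here, and the function name `f_score` is hypothetical.

```python
def f_score(true_pos: int, false_pos: int, false_neg: int, beta: float = 1.0) -> float:
    """F-beta score from raw counts; beta = 1 gives the balanced F1 score."""
    predicted = true_pos + false_pos  # everything the system labelled
    actual = true_pos + false_neg     # everything it should have labelled
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

A system that labels 10 tokens as named entities, 8 of them correctly, and misses 2 real ones has precision 0.8 and recall 0.8, and therefore a balanced f-score of 0.8.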
100	Ocenění společnosti Hewlett-Packard s.r.o. / Business valuation of Hewlett-Packard s.r.o. Vonásková, Sylva January 2008 The thesis concentrates on a business valuation of Hewlett-Packard s.r.o., the Czech subsidiary of Hewlett-Packard Company, a world-leading provider of IT goods and services. The thesis includes a comprehensive strategic analysis covering the company's internal and external potential and a market forecast. The financial analysis evaluates the financial situation of the company and is followed by a financial plan and a final valuation based on the DCF entity method, supplemented by a market approach based on the price trend of the parent company's publicly traded stock.
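The DCF entity method named in this abstract is commonly described as discounting free cash flows to the firm at the weighted average cost of capital to obtain an enterprise value, then subtracting net debt to reach equity value. The sketch below follows that textbook outline with a Gordon-growth terminal value. It is an assumption-laden illustration: the function name `dcf_entity_value`, its parameters, and any figures passed to it are hypothetical and do not come from the thesis.

```python
def dcf_entity_value(fcff, wacc, growth, net_debt):
    """Two-stage DCF entity valuation.

    fcff     -- forecast free cash flows to the firm for years 1..T
    wacc     -- discount rate (weighted average cost of capital), e.g. 0.10
    growth   -- perpetual growth rate after year T, must be below wacc
    net_debt -- interest-bearing debt minus cash at the valuation date

    Returns (enterprise_value, equity_value).
    """
    if wacc <= growth:
        raise ValueError("wacc must exceed the perpetual growth rate")
    # Stage 1: present value of the explicit forecast period.
    pv_explicit = sum(cf / (1 + wacc) ** t for t, cf in enumerate(fcff, start=1))
    # Stage 2: Gordon-growth terminal value at year T, discounted back to today.
    terminal_value = fcff[-1] * (1 + growth) / (wacc - growth)
    pv_terminal = terminal_value / (1 + wacc) ** len(fcff)
    enterprise_value = pv_explicit + pv_terminal
    return enterprise_value, enterprise_value - net_debt
```

The result is highly sensitive to the gap between `wacc` and `growth`, which is one reason a valuation of this kind is usually cross-checked against a second method, as both valuation abstracts in this listing do.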
