Global ETD Search

1	Reconstructing Signaling Pathways Using Regular-Language Constrained Paths Wagner, Mitchell James 18 September 2018 (has links) Signaling pathways are widely studied in systems biology. Several databases catalog our knowledge of these pathways, including the proteins and interactions that comprise them. However, high-quality curation of this information is slow and painstaking. As a result, many interactions still lack annotation concerning the pathways they participate in. A natural question that arises is whether or not it is possible to automatically leverage existing annotations to identify new interactions for inclusion in a given pathway. Here, we present RegLinker, an algorithm that achieves this purpose by computing multiple short paths from pathway receptors to transcription factors (TFs) within a background interaction network. The key idea underlying RegLinker is the use of regular-language constraints to control the number of non-pathway edges present in the computed paths. We systematically evaluate RegLinker and alternative approaches against a comprehensive set of 15 signaling pathways and demonstrate that RegLinker exhibits superior recovery of withheld pathway proteins and interactions. These results show the promise of our approach for prioritizing candidates for experimental study and the broader potential of automated analysis to attenuate difficulties of traditional manual inquiry. / Master of Science / Cells in the human body are constantly receiving signals that inform their response to a variety of conditions. These signals serve as cues to a cell, allowing it to make informed decisions that impact cellular processes such as movement, growth, and death. Cells employ proteins and the interactions between them to achieve these capabilities. Signals manifest as molecules that interact with proteins bound to membrane of a cell. When this happens, a cascade of interactions between the proteins inside the cell will be set off. Ultimately, this cascade activate or inhibit the cell’s production of new proteins, constituting a response to the signal received. The proteins and interactions involved in such a cascade together form what is known as a signaling pathway. Experiments have uncovered the interactions that are present in many signaling pathways, and researchers have carefully cataloged this information in publicly available databases. However, high-quality curation is slow and painstaking, and many known interactions have not been annotated as belonging to any pathway. A natural question that arises is whether or not it is possible to leverage existing annotations to automatically determine which new interactions to include in a given pathway. In this thesis, we present an efficient algorithm, RegLinker, for this purpose. We evaluate this method and alternative approaches on a comprehensive set of 15 signaling pathways and demonstrate that RegLinker is better at recovering interactions withheld from these pathways. In particular, we show RegLinker’s superior ability to identify interactions that utilize proteins that were not previously considered part of a pathway. These results underscore the promise of our approach for prioritizing candidates for experimental study and the broader potential of automated analysis to attenuate difficulties of traditional manual inquiry. Regular Languages Shortest Paths Signaling Networks
2	Learning of Timed Systems Grinchtein, Olga January 2008 (has links) <p>Regular inference is a research direction in machine learning. The goal of regular inference is to construct a representation of a regular language in the form of deterministic finite automaton (DFA) based on the set of positive and negative examples. DFAs take strings of symbols (words) as input, and produce a binary classification as output, indicating whether the word belongs to the language or not. There are two types of learning algorithms for DFAs: passive and active learning algorithms. In passive learning, the set of positive and negative examples is given and not chosen by inference algorithm. In contrast, in active learning, the learning algorithm chooses examples from which a model is constructed.</p><p>Active learning was introduced in 1987 by Dana Angluin. She presented the L* algorithm for learning DFAs by asking membership and equivalence queries to a teacher who knows the regular language accepted by DFA to be learned. A membership query checks whether a word belongs to the language or not. An equivalence query checks whether a hypothesized model is equivalent to the DFA to be learned.The L* algorithm has been found to be useful in different areas, including black box checking, compositional verification and integration testing. There are also other algorithms similar to L* for regular inference. However, the learning of timed systems has not been studied before. This thesis presents algorithms for learning timed systems in an active learning framework.</p><p>As a model of timed system we choose event-recording automata (ERAs), a determinizable subclass of the widely used timed automata. The advantages of ERA in comparison with timed automata, is that it is known priori the set of clocks of an ERA and when clocks are reset. The contribution of this thesis is four algorithms for learning deterministic event-recording automaton (DERA). Two algorithms learn a subclass of DERA, called event-deterministic ERA (EDERA) and two algorithms learn general DERA.</p><p>The problem with DERAs that they do not have canonical form. Therefore we focus on subclass of DERAs that have canonical representation, EDERA, and apply the L* algorithm to learn EDERAs. The L* algorithm in timed setting requires a procedure that learns clock guards of DERAs. This approach constructs EDERAs which are exponentially bigger than automaton to be learned. Another procedure can be used to lean smaller EDERAs, but it requires to solve NP-hard problem.</p><p>We also use the L* algorithm to learn general DERA. One drawback of this approach that inferred DERAs have a form of region graph and there is blow-up in the number of transitions. Therefore we introduce an algorithm for learning DERA which uses a new data structure for organising results of queries, called a timed decision tree, and avoids region graph construction. Theoretically this algorithm can construct bigger DERA than the L* algorithm, but in the average case we expect better performance.</p> Datavetenskap learning regular languages timed systems event-recording automata Datavetenskap
3	Monoids and the State Complexity of the Operation root(<i>L</i>) Krawetz, Bryan January 2004 (has links) In this thesis, we cover the general topic of state complexity. In particular, we examine the bounds on the state complexity of some different representations of regular languages. As well, we consider the state complexity of the operation root(<i>L</i>). We give quick treatment of the deterministic state complexity bounds for nondeterministic finite automata and regular expressions. This includes an improvement on the worst-case lower bound for a regular expression, relative to its alphabetic length. The focus of this thesis is the study of the increase in state complexity of a regular language <i>L</i> under the operation root(<i>L</i>). This operation requires us to examine the connections between abstract algebra and formal languages. We present results, some original to this thesis, concerning the size of the largest monoid generated by two elements. Also, we give good bounds on the worst-case state complexity of root(<i>L</i>). In turn, these new results concerning root(<i>L</i>) allow us to improve previous bounds given for the state complexity of two-way deterministic finite automata. Computer Science state complexity monoids formal languages regular languages
4	Monoids and the State Complexity of the Operation root(<i>L</i>) Krawetz, Bryan January 2004 (has links) In this thesis, we cover the general topic of state complexity. In particular, we examine the bounds on the state complexity of some different representations of regular languages. As well, we consider the state complexity of the operation root(<i>L</i>). We give quick treatment of the deterministic state complexity bounds for nondeterministic finite automata and regular expressions. This includes an improvement on the worst-case lower bound for a regular expression, relative to its alphabetic length. The focus of this thesis is the study of the increase in state complexity of a regular language <i>L</i> under the operation root(<i>L</i>). This operation requires us to examine the connections between abstract algebra and formal languages. We present results, some original to this thesis, concerning the size of the largest monoid generated by two elements. Also, we give good bounds on the worst-case state complexity of root(<i>L</i>). In turn, these new results concerning root(<i>L</i>) allow us to improve previous bounds given for the state complexity of two-way deterministic finite automata. Computer Science state complexity monoids formal languages regular languages
5	Learning of Timed Systems Grinchtein, Olga January 2008 (has links) Regular inference is a research direction in machine learning. The goal of regular inference is to construct a representation of a regular language in the form of deterministic finite automaton (DFA) based on the set of positive and negative examples. DFAs take strings of symbols (words) as input, and produce a binary classification as output, indicating whether the word belongs to the language or not. There are two types of learning algorithms for DFAs: passive and active learning algorithms. In passive learning, the set of positive and negative examples is given and not chosen by inference algorithm. In contrast, in active learning, the learning algorithm chooses examples from which a model is constructed. Active learning was introduced in 1987 by Dana Angluin. She presented the L* algorithm for learning DFAs by asking membership and equivalence queries to a teacher who knows the regular language accepted by DFA to be learned. A membership query checks whether a word belongs to the language or not. An equivalence query checks whether a hypothesized model is equivalent to the DFA to be learned.The L* algorithm has been found to be useful in different areas, including black box checking, compositional verification and integration testing. There are also other algorithms similar to L* for regular inference. However, the learning of timed systems has not been studied before. This thesis presents algorithms for learning timed systems in an active learning framework. As a model of timed system we choose event-recording automata (ERAs), a determinizable subclass of the widely used timed automata. The advantages of ERA in comparison with timed automata, is that it is known priori the set of clocks of an ERA and when clocks are reset. The contribution of this thesis is four algorithms for learning deterministic event-recording automaton (DERA). Two algorithms learn a subclass of DERA, called event-deterministic ERA (EDERA) and two algorithms learn general DERA. The problem with DERAs that they do not have canonical form. Therefore we focus on subclass of DERAs that have canonical representation, EDERA, and apply the L* algorithm to learn EDERAs. The L* algorithm in timed setting requires a procedure that learns clock guards of DERAs. This approach constructs EDERAs which are exponentially bigger than automaton to be learned. Another procedure can be used to lean smaller EDERAs, but it requires to solve NP-hard problem. We also use the L* algorithm to learn general DERA. One drawback of this approach that inferred DERAs have a form of region graph and there is blow-up in the number of transitions. Therefore we introduce an algorithm for learning DERA which uses a new data structure for organising results of queries, called a timed decision tree, and avoids region graph construction. Theoretically this algorithm can construct bigger DERA than the L* algorithm, but in the average case we expect better performance. learning regular languages timed systems event-recording automata
6	Graph automatic semigroups Carey, Rachael Marie January 2016 (has links) In this thesis we examine properties and constructions of graph automatic semigroups, a generalisation of both automatic semigroups and finitely generated FA-presentable semigroups. We consider the properties of graph automatic semigroups, showing that they are independent of the choice of generating set, have decidable word problem, and that if we have a graph automatic structure for a semigroup then we can find one with uniqueness. Semigroup constructions and their effect on graph automaticity are considered. We show that finitely generated direct products, free products, finitely generated Rees matrix semigroup constructions, zero unions, and ordinal sums all preserve unary graph automaticity, and examine when the converse also holds. We also demonstrate situations where semidirect products, Bruck-Reilly extensions, and semilattice constructions preserve graph automaticity, and consider the conditions we may impose on such constructions in order to ensure that graph automaticity is preserved. Unary graph automatic semigroups, that is semigroups which have graph automatic structures over a single letter alphabet, are also examined. We consider the form of an automaton recognising multiplication by generators in such a semigroup, and use this to demonstrate various properties of unary graph automatic semigroups. We show that infinite periodic semigroups are not unary graph automatic, and show that we may always find a uniform set of normal forms for a unary graph automatic semigroup. We also determine some necessary conditions for a semigroup to be unary graph automatic, and use this to provide examples of semigroups which are not unary graph automatic. Finally we consider semigroup constructions for unary graph automatic semigroups. We show that the free product of two semigroups is unary graph automatic if and only if both semigroups are trivial; that direct products do not always preserve unary graph automaticity; and that Bruck-Reilly extensions are never unary graph automatic. 512
7	Toward More Composable Software-Security Policies: Tools and Techniques Lomsak, Daniel 01 January 2013 (has links) Complex software-security policies are dicult to specify, understand, and update. The same is true for complex software in general, but while many tools and techniques exist for decomposing complex general software into simpler reusable modules (packages, classes, functions, aspects, etc.), few tools exist for decomposing complex security policies into simpler reusable modules. The tools that do exist for modularizing policies either encapsulate entire policies as atomic modules that cannot be decomposed or allow ne-grained policy modularization but require expertise to use correctly. This dissertation presents a policy-composition tool called PoliSeer [27, 26] and the PoCo policy-composition software-security language. PoliSeer is a GUI-based tool designed to enable users who are not expert policy engineers to exibly specify, visualize, modify, and enforce complex runtime policies on untrusted software. PoliSeer users rely on expert policy engineers to specify universally composable policy modules; PoliSeer users then build complex policies by composing those expert-written modules. This dissertation describes the design and implementation of PoliSeer and a case study in which we have used PoliSeer to specify and enforce a policy on PoliSeer itself. PoCo is a language for specifying composable software-security policies. PoCo users specify software-security policies in terms of abstract input-output event sequences. The policy outputs are expressive, capable of describing all desired, irrelevant, and prohibited events at once. These descriptive outputs compose well: operations for combining them satisfy a large number of algebraic properties, which allows policy hierarchies to be designed more simply and naturally. We demonstrate PoCo's capability via a case study in which a sophisticated policy is implemented in PoCo. Policy Composition Policy-Specification Languages Signed Regular Languages Software Engineering Visual Specification Computer Sciences
8	Shortcut Transformers and the Learnability of Automata Martens, Willeke January 2023 (has links) Transformers have emerged as a powerful architecture for various tasks in natural language processing, computer vision, and multi-modal domains. Despite their success, understanding the computational capabilities and limitations of transformers remains a challenge. This work focuses on relating transformers to deterministic finite automata (DFAs) and empirically investigates the architecture's ability to simulate DFAs of varying complexities. We empirically explore the simulation of DFAs by transformers, specifically focusing on the solvable A4-DFA and the non-solvable A5-DFA. We conduct experiments to evaluate the in-distribution and out-of-distribution accuracy of sub-linear depth transformers with positive results on both accounts. Additionally, we examine the impact of widening the transformer to find even shallower transformers for the A4-DFA. While no significant improvements are observed compared to the sub-linear depth transformers, further exploration of hyperparameters is needed for more reliable results. transformers modern neural networks automata expressivity learnability regular languages solvable groups Computer Sciences Datavetenskap (datalogi)
9	Applying DNA Self-assembly in Formal Language Theory Akkara, Pinto 14 October 2013 (has links) No description available. Computer Science DNA Self-assembly Formal Languages Regular Languages Context-Free Languages
10	Problems Related to Shortest Strings in Formal Languages Ang, Thomas January 2010 (has links) In formal language theory, studying shortest strings in languages, and variations thereof, can be useful since these strings can serve as small witnesses for properties of the languages, and can also provide bounds for other problems involving languages. For example, the length of the shortest string accepted by a regular language provides a lower bound on the state complexity of the language. In Chapter 1, we introduce some relevant concepts and notation used in automata and language theory, and we show some basic results concerning the connection between the length of the shortest string and the nondeterministic state complexity of a regular language. Chapter 2 examines the effect of the intersection operation on the length of the shortest string in regular languages. A tight worst-case bound is given for the length of the shortest string in the intersection of two regular languages, and loose bounds are given for two variations on the problem. Chapter 3 discusses languages that are defined over a free group instead of a free monoid. We study the length of the shortest string in a regular language that becomes the empty string in the free group, and a variety of bounds are given for different cases. Chapter 4 mentions open problems and some interesting observations that were made while studying two of the problems: finding good bounds on the length of the shortest squarefree string accepted by a deterministic finite automaton, and finding an efficient way to check if a finite set of finite words generates the free monoid. Some of the results in this thesis have appeared in work that the author has participated in \cite{AngPigRamSha,AngShallit}. automata theory algorithms and complexity finite state machine regular languages regular expressions formal languages theory of computation Computer Science

Search results