1

Comparison of Functional Dependency Extraction Methods and an Application of Depth First Search

Sood, Kanika 29 September 2014 (has links)
Extracting functional dependencies from existing databases is a useful technique in relational theory, database design, and data mining. Functional dependencies are a key property of relational schema design. A functional dependency is a database constraint between two sets of attributes. We present a comparative study of TANE, FUN, FD_Mine, FastFDs, and Dep_Miner, and we propose a new technique, KlipFind, to extract dependencies from relations efficiently. KlipFind employs a depth-first, heuristic-driven approach. Our study indicates that KlipFind is more space-efficient than any of the existing solutions and highly efficient at finding keys for relations.
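For context, verifying that a candidate dependency X → Y holds in a relation amounts to checking that no two tuples agree on X while disagreeing on Y; extraction algorithms such as TANE search the lattice of attribute sets with essentially this test at their core. The sketch below illustrates only that validity check (the relation and attribute names are invented), not KlipFind's depth-first search:

```python
def fd_holds(rows, lhs, rhs):
    """Return True iff the functional dependency lhs -> rhs holds in
    `rows`, a list of dicts mapping attribute names to values."""
    seen = {}  # projection on lhs -> first observed projection on rhs
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # two tuples agree on lhs but differ on rhs
    return True

# Toy relation: zip determines city, but city does not determine zip.
r = [
    {"zip": "97403", "city": "Eugene", "street": "Agate"},
    {"zip": "97403", "city": "Eugene", "street": "Moss"},
    {"zip": "97401", "city": "Eugene", "street": "Oak"},
]
print(fd_holds(r, ["zip"], ["city"]))   # True
print(fd_holds(r, ["city"], ["zip"]))   # False
```

An exhaustive extractor would run this check over the lattice of attribute subsets; the surveyed algorithms differ chiefly in how they prune that lattice.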
2

A Two-level Prediction Model for Deep Reactive Ion Etch (DRIE)

Taylor, Hayden K., Sun, Hongwei, Hill, Tyrone F., Schmidt, Martin A., Boning, Duane S. 01 1900 (has links)
We contribute a quantitative and systematic model to capture etch non-uniformity in deep reactive ion etch of microelectromechanical systems (MEMS) devices. Deep reactive ion etch is commonly used in MEMS fabrication where high-aspect-ratio features are to be produced in silicon. It is typical for many supposedly identical devices, perhaps of diameter 10 mm, to be etched simultaneously into one silicon wafer of diameter 150 mm. Etch non-uniformity depends on uneven distributions of ion and neutral species at the wafer level, and on local consumption of those species at the device, or die, level. An ion–neutral synergism model is constructed from data obtained by etching several layouts of differing pattern opening densities. This model is used to predict wafer-level variation with an r.m.s. error below 3%. It is then combined with a die-level model, which we have reported previously, and applied to a MEMS layout. The two-level model is shown to enable prediction of both within-die and wafer-scale etch rate variation for arbitrary wafer loadings. / Singapore-MIT Alliance (SMA)
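For background on the functional form (an assumption drawn from the plasma-etch literature, not the paper's fitted model): ion–neutral synergism rate laws are commonly written so that the etch rate is reciprocally limited by both the ion flux and the neutral flux, saturating when either species is abundant. A minimal sketch with made-up coefficients:

```python
def synergism_rate(ion_flux, neutral_flux, a=1.0, b=1.0):
    """Generic ion-neutral synergism etch rate:
    1/R = 1/(a*ion_flux) + 1/(b*neutral_flux).
    The rate is limited by whichever flux is scarcer."""
    return 1.0 / (1.0 / (a * ion_flux) + 1.0 / (b * neutral_flux))

# Neutral-starved regime: adding neutrals helps, adding ions barely does.
print(synergism_rate(ion_flux=10.0, neutral_flux=1.0))   # ~0.909
print(synergism_rate(ion_flux=10.0, neutral_flux=2.0))   # ~1.667
print(synergism_rate(ion_flux=20.0, neutral_flux=1.0))   # ~0.952
```

This reciprocal structure is what lets pattern density enter the model: denser openings consume more neutrals locally, starving neighboring features.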
3

Representing and Minimizing Multidimensional Dependencies

Chakilam, Krishna Chaitanya 05 October 2009 (has links)
No description available.
4

Enforcing Temporal and Ontological Dependencies Over Graphs

Alipourlangouri, Morteza January 2022 (has links)
Graphs provide powerful abstractions and are widely used in different areas. There has been an increasing demand for the graph data model to represent data in applications such as network management, web page analysis, knowledge graphs, and social networks. These graphs are usually dynamic and represent the time-evolving relationships between entities. Enforcing and maintaining data quality in graphs is a critical task for decision making, operational efficiency, and accurate data analysis, as recent studies have shown that data scientists spend 60-80% of their time cleaning and organizing data [2]. This effort motivates the need for effective data cleaning tools to reduce the user burden. The study of data quality management focuses on a set of dimensions, including data consistency, data deduplication, information completeness, data currency, and data accuracy. Achieving all of these characteristics is often not possible in practice, due to personnel costs and performance constraints. In this thesis, we focus on tackling three problems in two data quality dimensions: data consistency and data deduplication. To address the problem of data consistency over temporal graphs, we present a new class of data dependencies called Temporal Graph Functional Dependencies (TGFDs). TGFDs generalize functional dependencies to temporal graphs, viewed as a sequence of graph snapshots induced by time intervals, and enforce both topological constraints and attribute value dependencies that must be satisfied by these snapshots. We establish the complexity results for the satisfiability and implication problems of TGFDs, propose a sound and complete axiomatization system for TGFDs, and present efficient parallel algorithms to detect inconsistencies in temporal graphs as violations of TGFDs. To address the data deduplication problem, we first study key discovery for graphs. Keys for graphs use topology and value constraints to uniquely identify entities in a graph database, and they are the main tool for data deduplication in graphs. We present two properties that define a key, minimality and support, and an algorithm to mine keys over graphs via frequent subgraph expansion. However, existing key constraints identify entities by enforcing label equality on node types; these constraints can be too restrictive to characterize structures and node labels that are syntactically different but semantically equivalent. Lastly, we propose a new class of key constraints, Ontological Graph Keys (OGKs), that extend conventional graph keys by ontological subgraph matching between entity labels and an external ontology. We study the entity matching problem with OGKs and develop efficient algorithms to perform entity matching based on a Chase procedure. The proposed dependencies and algorithms in this thesis improve consistency detection in temporal graphs, automate the discovery of keys in graphs, and enrich the semantic expressiveness of graph keys. / Dissertation / Doctor of Science (PhD)
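As a rough illustration of detecting inconsistencies as dependency violations (a simplification that omits the graph-pattern matching and time intervals central to real TGFDs; the snapshot encoding and attribute names are assumptions):

```python
def tgfd_violations(snapshots, node_type, lhs, rhs):
    """Simplified TGFD-style check: within every snapshot, any two
    nodes of `node_type` that agree on the lhs attributes must also
    agree on the rhs attributes. A snapshot is a list of node dicts.
    Real TGFDs additionally match a topological pattern and apply
    only over specified time intervals."""
    violations = []
    for t, snapshot in enumerate(snapshots):
        seen = {}
        for node in snapshot:
            if node.get("type") != node_type:
                continue
            key = tuple(node.get(a) for a in lhs)
            val = tuple(node.get(a) for a in rhs)
            if seen.setdefault(key, val) != val:
                violations.append((t, key))
    return violations

g0 = [{"type": "player", "team": "A", "coach": "Kim"},
      {"type": "player", "team": "A", "coach": "Kim"}]
g1 = [{"type": "player", "team": "A", "coach": "Kim"},
      {"type": "player", "team": "A", "coach": "Lee"}]  # inconsistent
print(tgfd_violations([g0, g1], "player", ["team"], ["coach"]))
# [(1, ('A',))]
```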
5

Extending dependencies for improving data quality

Ma, Shuai January 2011 (has links)
This doctoral thesis presents the results of my work on extending dependencies for improving data quality, both in a centralized environment with a single database and in a data exchange and integration environment with multiple databases. The first part of the thesis proposes five classes of data dependencies, referred to as CINDs, eCFDs, CFDcs, CFDps and CINDps, to capture data inconsistencies commonly found in practice in a centralized environment. For each class of these dependencies, we investigate two central problems: the satisfiability problem and the implication problem. The satisfiability problem is to determine, given a set Σ of dependencies defined on a database schema R, whether there exists a nonempty database D of R that satisfies Σ. The implication problem is to determine whether a set Σ of dependencies defined on a database schema R entails another dependency φ on R; that is, every database D of R that satisfies Σ must satisfy φ as well. These problems are important for the validation and optimization of data-cleaning processes. We establish complexity results for the satisfiability and implication problems for all five classes of dependencies, both in the absence of finite-domain attributes and in the general setting with finite-domain attributes. Moreover, SQL-based techniques are developed to detect data inconsistencies for each class of the proposed dependencies, which can easily be implemented on top of current database management systems. The second part of the thesis studies three important topics for data cleaning in a data exchange and integration environment with multiple databases. The first is the dependency propagation problem: to determine, given a view defined on data sources and a set of dependencies on the sources, whether another dependency is guaranteed to hold on the view. We investigate dependency propagation for views defined in various fragments of relational algebra, with conditional functional dependencies (CFDs) [FGJK08] as view dependencies, and with source dependencies given as either CFDs or traditional functional dependencies (FDs). We establish matching lower and upper bounds, ranging from PTIME to undecidable. These not only provide the first results for CFD propagation, but also extend the classical work on FD propagation by giving new complexity bounds in the presence of finite domains. We also provide the first algorithm for computing a minimal cover of all CFDs propagated via SPC views; the algorithm has the same complexity as one of the most efficient algorithms for computing a cover of FDs propagated via a projection view, despite the increased expressive power of CFDs and SPC views. The second topic is matching records from unreliable data sources. A class of matching dependencies (MDs) is introduced for specifying the semantics of unreliable data. As opposed to static constraints for schema design such as FDs, MDs are developed for record matching, and are defined in terms of similarity metrics and a dynamic semantics. We identify a special case of MDs, referred to as relative candidate keys (RCKs), to determine what attributes to compare and how to compare them when matching records across possibly different relations.
We also propose a mechanism for inferring MDs, with a sound and complete system that departs from traditional implication analysis, such that when we cannot match records by comparing attributes that contain errors, we may still find matches by using other, more reliable attributes. We provide a quadratic-time algorithm for inferring MDs, and an effective algorithm for deducing quality RCKs from a given set of MDs. The third topic is finding certain fixes for data monitoring [CGGM03, SMO07]: finding and correcting errors in a tuple when it is created, whether entered manually or generated by some process. That is, we want to ensure that a tuple t is clean before it is used, to prevent errors introduced by adding t. As noted by [SMO07], it is far less costly to correct a tuple at the point of entry than to fix it afterward. Data repairing based on integrity constraints may not find certain fixes that are absolutely correct, and worse, may introduce new errors when repairing the data. We propose a method for finding certain fixes based on master data, a notion of certain regions, and a class of editing rules. A certain region is a set of attributes that are assured correct by the user. Given a certain region and master data, editing rules tell us which attributes to fix and how to update them. We show how the method can be used in data monitoring and enrichment. We develop techniques for reasoning about editing rules, to decide whether they lead to a unique fix and whether they are able to fix all the attributes in a tuple, relative to master data and a certain region. We also provide an algorithm to identify minimal certain regions, such that a certain fix is warranted by editing rules and master data as long as one of the regions is correct.
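To give a flavor of inconsistency detection with conditional dependencies (the thesis develops SQL-based detection; this Python rendering and the relation below are illustrative assumptions), a CFD pairs an FD with a pattern tableau that restricts where the dependency applies:

```python
def cfd_violations(rows, lhs, rhs, pattern):
    """Check a single-tuple-pattern CFD (lhs -> rhs, pattern).
    `pattern` maps attributes to required constants, with '_' as a
    wildcard. Tuples not matching the lhs pattern are exempt."""
    def matches(row, attrs):
        return all(pattern.get(a, "_") in ("_", row[a]) for a in attrs)

    bad, seen = [], {}
    for row in rows:
        if not matches(row, lhs):
            continue  # the condition does not apply to this tuple
        if not matches(row, rhs):
            bad.append(row)   # constant pattern on rhs violated outright
            continue
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            bad.append(row)   # variable-pattern violation, as for an FD
    return bad

# CFD: for UK records (country = '44'), zip determines street.
rows = [
    {"country": "44", "zip": "EH8", "street": "Mayfield"},
    {"country": "44", "zip": "EH8", "street": "Crichton"},  # violation
    {"country": "01", "zip": "EH8", "street": "Oak"},       # exempt
]
print(cfd_violations(rows, ["country", "zip"], ["street"],
                     {"country": "44", "zip": "_", "street": "_"}))
```

In the thesis's setting the same logic is expressed as a pair of SQL queries per dependency class, so detection runs inside the database engine.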
6

Syntax-mediated semantic parsing

Reddy Goli, Venkata Sivakumar January 2017 (has links)
Querying a database to retrieve an answer, telling a robot to perform an action, or teaching a computer to play a game are tasks requiring communication with machines in a language interpretable by them. Semantic parsing is the task of converting human language to a machine-interpretable language. While human languages are sequential in nature with latent structures, machine-interpretable languages are formal with explicit structures. The computational linguistics community has created several treebanks to understand the formal syntactic structures of human languages. In this thesis, we use these to obtain formal meaning representations of sentences, and we learn computational models to convert those meaning representations to the target machine representation. Our goal is to evaluate whether existing treebank syntactic representations are useful for semantic parsing. Existing semantic parsing methods mainly learn domain-specific grammars that parse human language directly into a machine representation. We deviate from this trend and make use of general-purpose syntactic grammar to help in semantic parsing. We use two syntactic representations: Combinatory Categorial Grammar (CCG) and dependency syntax. CCG has a well-established theory for deriving meaning representations from its syntactic derivations, but CCG treebanks exist for few languages, since they are difficult to annotate. In contrast, dependencies are easy to annotate and have many treebanks; however, dependencies lack a well-established theory for deriving meaning representations. In this thesis, we propose novel theories for deriving meaning representations from dependencies. Our evaluation task is question answering on a knowledge base: given a question, our goal is to answer it by converting the question to an executable query. We use Freebase, the knowledge source behind Google’s search engine, as our knowledge base. Freebase contains millions of real-world facts represented in a graphical format. Inspired by the Freebase structure, we formulate semantic parsing as a graph matching problem: given a natural language sentence, we convert it into a graph structure from the meaning representation obtained from syntax, and find the subgraph of Freebase that best matches the natural language graph. Our experiments on the Free917, WebQuestions, and GraphQuestions semantic parsing datasets show that general-purpose syntax is more useful for semantic parsing than induced task-specific syntax and syntax-agnostic representations.
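As a toy illustration of the graph-matching formulation (the thesis's grounding models and the Freebase machinery are far richer; the KB triples and scoring below are invented):

```python
# Toy grounding: match an ungrounded question edge to a KB edge by
# token overlap between the question's relation words and predicates.
KB = [
    ("Austin", "location.containedby", "Texas"),
    ("Austin", "location.capital_of", "Texas"),
    ("Dallas", "location.containedby", "Texas"),
]

def ground(entity, relation_words):
    """Pick the KB edge at `entity` whose predicate shares the most
    tokens with the question's relation words (a crude stand-in for
    the learned scoring used in real semantic parsers)."""
    def score(pred):
        tokens = set(pred.replace(".", "_").split("_"))
        return len(tokens & set(relation_words))
    edges = [e for e in KB if e[0] == entity]
    return max(edges, key=lambda e: score(e[1]))

# "What is Austin the capital of?" -> ungrounded edge (Austin, capital, ?x)
print(ground("Austin", ["capital"]))
# ('Austin', 'location.capital_of', 'Texas')
```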
7

DIM : A systematic and lightweight method for identifying dependencies between requirements

Gomez, Arturo, Rueda, Gema January 2010 (has links)
Dependencies between requirements are a crucial factor in any software development effort, since they impact many project areas. Nevertheless, their identification remains a challenge. Some methods have been proposed, but none of them is readily applicable to real projects due to high cost or low accuracy. DIM is a lightweight method for identifying dependencies, proposed in a previous paper. This paper presents an experiment comparing the sets of dependencies found by DIM and by a method based on pair-wise comparison. The experiment used a requirements specification for an open-source project, extracted by reverse engineering. Our results provide evidence that DIM finds more dependencies and that its results (the dependencies identified) do not depend on the profile of the practitioner applying it. Another important result is that DIM requires fewer resources, since it does not rely on pair-wise comparisons and can easily be automated. / Avda. Espana 101 P6 Bj-E 28341, Madrid, Spain. Telephone number: +34627770492
8

Reliability Assessment for Cloud Applications

Wang, Xiaowei 11 January 2017 (has links)
No description available.
9

Small-molecule probes to explore cancer

Schaefer, Giannina Ines 04 June 2015 (has links)
Small molecules play important roles in therapeutics and drug discovery. The chemical biology community has made significant progress in discovering small-molecule probes to explore biological processes and to treat disease. This thesis describes both the discovery of novel probes for the Hedgehog (Hh) pathway and the application of small molecules in identifying cancer dependencies. / Chemistry and Chemical Biology
10

Reference Coupling: A Method for Identifying Software Ecosystems of Technically Dependent Projects

Harrison, Francis 22 December 2015 (has links)
Software projects are not developed in isolation. Open source software projects encourage networked collaboration and interdependence across projects and developers. Recent research has shifted to studying software ecosystems: communities of projects that depend on each other and are developed together. However, identifying technical dependencies at the ecosystem level can be challenging. In this dissertation, we propose a new method, reference coupling, for detecting technical dependencies between projects. The method establishes dependencies through user-specified cross-references between projects. We use our method to identify ecosystems among GitHub-hosted projects, and we identify several characteristics of those ecosystems. Our findings show that most ecosystems are centered on one project and are interconnected with other ecosystems. The predominant type of ecosystem is one that develops tools to support software development. We also found that project owners’ social behavior aligns well with the technical dependencies within the ecosystem, but project contributors’ social behavior does not. We conclude with a discussion of future research enabled by our reference coupling method. / Graduate / harrison.franc@gmail.com
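For intuition, GitHub renders mentions of the form owner/repo#number as cross-references between repositories, so a first-pass detector can scan comment text for that syntax. The sketch below is an assumption-laden simplification, not the dissertation's exact procedure:

```python
import re
from collections import Counter

# GitHub-style cross-reference: owner/repo#issue_number.
XREF = re.compile(r"\b([\w.-]+)/([\w.-]+)#\d+\b")

def reference_couplings(comments_by_project):
    """Count user-specified cross-references from each project to
    other projects, a rough proxy for technical dependency edges."""
    edges = Counter()
    for src, comments in comments_by_project.items():
        for text in comments:
            for owner, repo in XREF.findall(text):
                dst = f"{owner}/{repo}"
                if dst != src:  # ignore self-references
                    edges[(src, dst)] += 1
    return edges

comments = {
    "alice/app": ["Blocked by bob/lib#42", "See bob/lib#57 and alice/app#3"],
    "bob/lib": ["Fixes bob/lib#42"],
}
print(reference_couplings(comments))
# Counter({('alice/app', 'bob/lib'): 2})
```

Aggregating such edges across all projects yields the dependency graph whose connected clusters are then treated as candidate ecosystems.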
