21 |
On Generalizations of Gowers Norms. Hatami, Hamed. 01 March 2010
Inspired by the definition of Gowers norms we study integrals of products of multi-variate functions. The $L_p$ norms, certain trace norms, and the Gowers norms are all defined by taking the proper root of one of these integrals. These integrals are important from a combinatorial point of view as inequalities between them are useful
in understanding the relation between various subgraph densities.
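As standard background (not a result of this thesis), the Gowers $U^2$ norm is one example of a norm obtained as the proper root of such an integral: for a bounded function $f$ on a finite abelian group $G$,
$$ \|f\|_{U^2}^{4} \;=\; \mathbb{E}_{x,h_1,h_2\in G}\, f(x)\,\overline{f(x+h_1)}\,\overline{f(x+h_2)}\,f(x+h_1+h_2). $$
The right-hand side is an average of a product of translated copies of $f$, which is the kind of multi-variate integral studied here; the $L_{2k}$ norms arise similarly from the integral of $|f|^{2k}$.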
Lov\'asz asked the following questions: (1) Which integrals correspond to norm functions? (2) What are the common properties of the corresponding normed spaces? We address these two questions.
We show that such an integral defines a norm if and only if it satisfies a H\"older type inequality. This condition turns out to be very useful: first, we apply it to prove various necessary conditions on the structure of the integrals that correspond to norm functions.
We also apply the condition to an important conjecture of Erd\H{o}s, Simonovits, and Sidorenko. Roughly speaking, the conjecture says that among all graphs with the same edge density, random graphs contain the least number of copies of every bipartite graph. This had been verified previously for trees, the $3$-dimensional cube, and a few other families of bipartite graphs. The special case of the conjecture for paths, one of the simplest families of bipartite
graphs, is equivalent to the Blakley-Roy theorem in linear algebra.
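In the homomorphism-density notation commonly used for this conjecture (stated here only as background), with $t(H,G)$ denoting the probability that a uniformly random map from $V(H)$ to $V(G)$ is a homomorphism, the assertion for a bipartite graph $H$ with $m$ edges reads
$$ t(H,G) \;\ge\; t(K_2,G)^{\,m}, $$
so the density of copies of $H$ in $G$ is at least what the edge density alone would predict; the case where $H$ is a path is the Blakley-Roy inequality.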
Our results verify the conjecture for certain graphs including all hypercubes, one of the important classes of bipartite graphs, and thus generalize a result of Erd\H{o}s and Simonovits. In fact, for hypercubes we can prove statements that are surprisingly stronger than the assertion of the conjecture.
To address the second question of Lov\'asz we study these normed spaces from a geometric point of view, and determine their moduli of smoothness and convexity. These two parameters are among the most important invariants in Banach space theory. Our result in particular determines the moduli of smoothness and convexity of Gowers norms. In some cases we are able to prove the Hanner
inequality, one of the strongest inequalities related to the concept
of smoothness and convexity. We also prove a complex interpolation theorem for these normed spaces, and use it together with the Hanner inequality to obtain results that are optimal in the constants involved in the definition of the moduli of smoothness and convexity.
22 |
Queries, Data, and Statistics: Pick Two. Mishra, Chaitanya. 21 April 2010
The query processor of a relational database system executes declarative queries on relational data using query evaluation plans. The cost of the query evaluation plan depends on various statistics defined by the query and data. These statistics include intermediate and base table sizes, and data
distributions on columns. In addition to being an important factor in query optimization, such statistics also influence various runtime properties of the query evaluation plan.
This thesis explores the interactions between queries, data, and statistics in the query processor of a relational database system. Specifically, we consider problems where any two of the three - queries, data, and statistics - are provided, with the
objective of instantiating the missing element in the triple such that the query, when executed on the data, satisfies the statistics on the associated subexpressions. We present multiple query processing problems that can be abstractly formulated in this manner.
The first contribution of this thesis is a monitoring framework for collecting and estimating statistics during query execution. We apply this framework to the problems of
monitoring the progress of query execution, and adaptively reoptimizing query execution plans. Our monitoring and adaptivity framework has a low overhead, while significantly reducing query execution times. This work demonstrates the feasibility and utility of overlaying statistics estimators on query evaluation plans.
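A minimal sketch of the overlay idea (hypothetical names and interfaces, not the monitoring framework developed in the thesis): attach a counter to each plan operator, refine the optimizer's cardinality estimate as tuples are observed, and report a crude progress fraction from the refined totals.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class OperatorMonitor:
    """One plan operator: the optimizer's cardinality estimate vs. tuples observed so far."""
    name: str
    estimated_rows: float          # optimizer's estimate (assumed to be supplied)
    observed_rows: int = 0

    def observe(self, batch: int) -> None:
        self.observed_rows += batch
        # Refine the estimate at runtime: it can never be below what we have already seen.
        self.estimated_rows = max(self.estimated_rows, float(self.observed_rows))

@dataclass
class PlanMonitor:
    operators: list[OperatorMonitor] = field(default_factory=list)

    def progress(self) -> float:
        """Fraction of the (refined) estimated total work already performed."""
        total_est = sum(op.estimated_rows for op in self.operators)
        total_obs = sum(op.observed_rows for op in self.operators)
        return 0.0 if total_est == 0 else min(1.0, total_obs / total_est)

# Hypothetical usage: a scan feeding a join, with estimates taken from an optimizer.
scan = OperatorMonitor("scan_orders", estimated_rows=1_000_000)
join = OperatorMonitor("hash_join", estimated_rows=250_000)
plan = PlanMonitor([scan, join])
scan.observe(400_000)
join.observe(90_000)
print(f"progress ~ {plan.progress():.0%}")   # roughly 39% under these numbers
```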
Our next contribution is a framework for testing the performance of a query
processor by generating targeted test queries and databases. We present techniques for data-aware query generation, and query-aware data generation that satisfy test cases specifying statistical constraints. We formally analyze the hardness of the problems considered, and present systems that support best-effort semantics for targeted query and data generation.
The final contribution of this thesis is a set of techniques for designing queries for business intelligence applications, where cardinality constraints are specified on the query result. We present an interactive query refinement framework that explicitly incorporates user feedback into query design, refining queries that return too many or too few answers.
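A minimal sketch of one refinement step (the predicate, bounds, and monotonicity assumption are hypothetical; the framework in the thesis is interactive and handles richer refinements): if the result cardinality is monotone in a numeric predicate threshold, the threshold can be binary-searched until the count lands in the desired range.

```python
def refine_threshold(count_at, lo_card, hi_card, lo=0.0, hi=1_000_000.0, iters=50):
    """Find t with lo_card <= count_at(t) <= hi_card, assuming count_at is non-increasing in t.

    count_at(t) is the number of answers of the query with the predicate `value >= t`.
    """
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        n = count_at(mid)
        if n > hi_card:        # too many answers: tighten the predicate
            lo = mid
        elif n < lo_card:      # too few answers: relax the predicate
            hi = mid
        else:
            return mid
    return (lo + hi) / 2.0

# Hypothetical usage against a toy column of values.
values = range(100_000)
count = lambda t: sum(v >= t for v in values)
t = refine_threshold(count, lo_card=900, hi_card=1_100)
print(round(t), count(t))
```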
Each of these contributions is accompanied by a formal analysis of the problem, and a detailed experimental evaluation of an associated system.
23 |
Constructions, Lower Bounds, and New Directions in Cryptography and Computational Complexity. Papakonstantinou, Periklis. 01 September 2010
In the first part of the thesis we show black-box separations in public- and private-key cryptography. Our main result answers in the negative the question of whether we can base Identity Based Encryption (IBE) on Trapdoor Permutations. Furthermore, we make progress towards a black-box separation of IBE from the Decisional Diffie-Hellman assumption. We also show the necessity of adaptivity when querying one-way permutations to construct pseudorandom generators à la Goldreich-Levin, an issue related to streaming models for cryptography.
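For background on the construction referred to here (a textbook Goldreich-Levin-style sketch with a toy permutation standing in for a genuine one-way permutation; it is not a result of this thesis): a one-way permutation $f$ is stretched by one bit using the inner-product hard-core predicate, $G(x, r) = f(x) \| r \| \langle x, r\rangle \bmod 2$. The adaptivity question studied in the thesis concerns how such constructions must query $f$.

```python
import secrets

def inner_product_bit(x: int, r: int) -> int:
    """Goldreich-Levin hard-core bit: <x, r> mod 2 over the bit strings of x and r."""
    return bin(x & r).count("1") % 2

def toy_permutation(x: int, n: int) -> int:
    """Stand-in for a one-way permutation on n-bit strings (NOT one-way; illustration only)."""
    return (3 * x + 1) % (1 << n)   # multiplication by an odd constant is a bijection mod 2^n

def prg(seed: int, n: int) -> int:
    """G(x || r) = f(x) || r || <x, r>: maps a 2n-bit seed to 2n + 1 output bits."""
    x, r = seed >> n, seed & ((1 << n) - 1)
    return (toy_permutation(x, n) << (n + 1)) | (r << 1) | inner_product_bit(x, r)

n = 16
seed = secrets.randbits(2 * n)
print(f"{prg(seed, n):0{2 * n + 1}b}")   # 33 output bits from a 32-bit seed
```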
In the second part we introduce streaming techniques for understanding randomness in efficient computation, for proving lower bounds for efficiently computable problems, and for computing cryptographic primitives.
We observe, following [Coo71], that logarithmic space-bounded Turing Machines equipped with an unbounded stack, henceforth called Stack Machines, together with an external random tape of polynomial length, characterize RP, BPP, and so on. By parametrizing on the number of passes over the random tape, we provide a technical perspective that brings together Streaming, Derandomization, and older work on Stack Machines, and we relate this new model to previous work in derandomization. For example, we show that to derandomize parts of BPP it is in some sense sufficient to derandomize BPNC (a class believed to be much lower than P \subseteq BPP). We also obtain a number of results for variants of the main model, regarding, e.g., the fooling power of Nisan's pseudorandom generator (PRG) [N92] for the derandomization of BPNC^1, and the relation of parametrized access to NP-witnesses to width-parametrizations of SAT.
A substantial contribution is a streaming approach to lower bounds for problems in the NC hierarchy (and above). We apply Communication Complexity to show a streaming lower bound for a model with an unbounded (free-to-access) pushdown storage. In particular, we obtain an $n^{\Omega(1)}$ lower bound, simultaneously in the space and in the number of passes over the input, for a variant of inner product. This is the first lower bound for machines that correspond to poly-size circuits, can compute Parity and Barrington's language, and, assuming EXP \neq PSPACE, decide problems in P - NC.
Finally, we initiate the study of log-space streaming computation of cryptographic primitives. We observe that the work on Cryptography in NC^0 [AIK08] yields a non-black-box construction of a one-way function computable in an O(log n)-space bounded streaming model. We also show that relying on this work is in some sense necessary.
24 |
Summarizing Spoken Documents Through Utterance Selection. Zhu, Xiaodan. 02 September 2010
The inherently linear and sequential property of speech raises the need for
ways to better navigate through spoken documents. The strategy of navigation I
focus on in this thesis is summarization, which aims to identify important excerpts
in spoken documents.
A basic characteristic that distinguishes speech summarization from traditional
text summarization is the availability and utilization of speech-related features.
Most previous research, however, has addressed this source from the perspective of
descriptive linguistics, considering only those prosodic features that appear in that
literature. The experiments in this dissertation suggest that incorporating prosody
does help, but that its usefulness is very limited, much less than has been suggested in
some previous research. We reassess the role of prosodic features vs. features arising
from speech recognition transcripts, as well as baseline selection in error-prone
and disfluency-filled spontaneous speech. These problems interact with each other,
and isolated observations have hampered a comprehensive understanding to date.
The effectiveness of these prosodic features is limited largely by their
difficulty in predicting content relevance and redundancy. Nevertheless, untranscribed
audio does contain more information than just prosody. This dissertation
shows that collecting statistics from far more complex acoustic patterns does allow
for estimating state-of-the-art summarization models directly. To this end, we propose
an acoustics-based summarization model that is estimated directly on acoustic
patterns. We empirically determine the extent to which this acoustics-based model
can effectively replace ASR-based models.
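As a generic illustration of extractive utterance selection (a simple relevance-versus-redundancy sketch over whatever per-utterance feature vectors are available, acoustic or transcript-based; it is not the acoustics-based model estimated in the thesis), a greedy selector can favor utterances similar to the document as a whole while penalizing similarity to utterances already chosen:

```python
import numpy as np

def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return 0.0 if denom == 0 else float(a @ b) / float(denom)

def select_utterances(features, k, lam=0.7):
    """Greedy MMR-style selection; `features` is a (num_utterances x dim) array."""
    doc = features.mean(axis=0)                      # crude whole-document representation
    selected, candidates = [], set(range(len(features)))
    while candidates and len(selected) < k:
        def score(i):
            relevance = cosine(features[i], doc)
            redundancy = max((cosine(features[i], features[j]) for j in selected), default=0.0)
            return lam * relevance - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Hypothetical usage with random vectors standing in for per-utterance features.
feats = np.random.default_rng(0).normal(size=(50, 32))
print(select_utterances(feats, k=5))
```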
The written sources previously used to benefit speech summarization have also
been limited, namely to noisy speech recognition transcripts. Predicting the
salience of utterances can indeed benefit from more sources than raw audio alone.
Since speaking and writing are two basic modes of communication that are by nature
closely related, speech is in many situations accompanied by relevant
written text. The richer semantics conveyed in this written text provide
additional information beyond the speech itself. This thesis utilizes such information
in content selection to help identify salient utterances in the corresponding speech
documents. We also employ such richer content to find the structure of spoken
documents—i.e., subtopic boundaries—which may in turn help summarization.
25 |
Decomposition and Symmetry in Constraint Optimization Problems. Kitching, Matthew. 14 November 2011
This thesis presents several techniques that advance search-based algorithms for solving
Constraint Optimization Problems (COPs). These techniques exploit structural features
common in such problems. In particular, the thesis presents a number of innovative algorithms, and associated data structures, designed to exploit decomposition and symmetry in COPs.
First, a new technique called component templating is introduced. Component templates are data structures for efficiently representing the disjoint sub-problems that are encountered during search. Information about each disjoint sub-problem can then be reused during search, increasing efficiency.
A new algorithm called OR-decomposition is introduced. This algorithm obtains many of the computational benefits of decomposition without the need to resort to separate recursions. That is, the algorithm explores a standard OR tree rather than an AND-OR tree. In this way, the search algorithm gains greater freedom in its variable ordering compared to previous decomposition algorithms.
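A minimal sketch of the decomposition step that such algorithms rely on (hypothetical names and interfaces): after some variables are assigned, the constraint graph over the remaining variables may fall apart into connected components, and each component is an independent sub-problem that can be bounded, solved, or cached (e.g., in a component template) on its own.

```python
from collections import defaultdict

def components(unassigned, constraints):
    """Connected components of the constraint graph restricted to unassigned variables.

    `constraints` is an iterable of constraint scopes (collections of variable names).
    """
    adj = defaultdict(set)
    for scope in constraints:
        active = [v for v in scope if v in unassigned]
        for u in active:
            adj[u].update(w for w in active if w != u)
    seen, comps = set(), []
    for v in unassigned:
        if v in seen:
            continue
        stack, comp = [v], set()
        while stack:                       # simple depth-first traversal
            u = stack.pop()
            if u in comp:
                continue
            comp.add(u)
            stack.extend(adj[u] - comp)
        seen |= comp
        comps.append(comp)
    return comps

# Hypothetical usage: with X2 assigned, {X1} and {X3, X4} become independent sub-problems.
print(components(unassigned={"X1", "X3", "X4"},
                 constraints=[("X1", "X2"), ("X2", "X3"), ("X3", "X4")]))
```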
Although decomposition algorithms such as OR-decomposition are effective techniques for solving COPs with low tree-width, existing decomposition algorithms offer
little advantage over branch and bound search on problems with high tree-width. A new method for exploiting decomposition on problems with high tree-width is presented. This technique involves detecting and exploiting decompositions on a selected subset of the problem’s objectives. Such decompositions can then be used to more efficiently compute additional bounds that can be used by branch and bound search.
The second half of the thesis explores the use of symmetries in COPs. Using component templates, it is possible to exploit dynamic symmetries that appear during search when some of the variables of a problem have been assigned a value. Symmetries have not previously been combined with decomposition in COPs.
An algorithm called Set Branching is presented, which exploits almost-symmetries in the values of a variable by clustering similar values together, then branching on sets of values rather than on each single value.
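A rough sketch of the branching step (the scoring function is hypothetical, and the clustering criterion in the thesis is based on almost-symmetry rather than this simple rounding): group a variable's values whose estimated effect is nearly identical and branch once per group instead of once per value.

```python
from itertools import groupby

def value_groups(values, impact, tol=1e-6):
    """Cluster a variable's values by approximately equal estimated impact.

    `impact(v)` is a placeholder heuristic, e.g. the bound obtained after assigning v.
    """
    keyed = sorted(values, key=impact)
    return [list(grp) for _, grp in groupby(keyed, key=lambda v: round(impact(v) / tol))]

# Branch on each returned group as a set of values, refining within a group only if needed.
print(value_groups(range(6), impact=lambda v: v % 2))   # -> [[0, 2, 4], [1, 3, 5]]
```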
The decomposition and symmetry algorithms presented in this thesis increase the
efficiency of constraint optimization solvers. The thesis also presents experimental results that test these algorithms on a variety of real world problems, and demonstrate performance improvements over current state-of-the-art techniques.
26 |
Using Modeler Intent in Software Engineering. Salay, Richard. 17 February 2011
Models are used widely within software engineering and have been studied from many perspectives. A perspective that has received little attention is the role of modeler intent in modeling. Knowing the intent of the modeler supports both model comprehension, by providing the correct context for interpreting the model, and model quality, by clearly defining what information the model must contain. Furthermore, formal expressions of this intent enable automated support for both. Despite the value that knowledge of modeler intent can provide, the current state of modeling practice offers no adequate means for expressing this information. The focus of this thesis is to address this gap by providing mechanisms for expressing modeler intent both explicitly and formally.
We approach this problem by recognizing the existence of a role level in modeling, where the role each model plays defines what information it should contain and how this information is related to the information in other models. The specification of these roles is what we refer to as the expression of modeler intent. We then present a framework that incorporates four aspects of modeler intent at the role level: the existential intent of a model, which arises in response to stakeholders' need for a set of information; the content criteria, which express what information the model is intended to contain; the model relationships, which express how models are intended to constrain one another; and the decomposition criteria, which express the intent behind how a model is decomposed into a collection of models. A key contribution of this thesis is the specification of the macromodeling language, a new modeling language designed for the role level that supports the expression of all four aspects of modeler intent. We evaluate these techniques by applying them to two real-world modeling examples.
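One way to make the four role-level aspects concrete (a purely illustrative encoding; the thesis instead defines a macromodeling language for this purpose) is to record, for each model role, why it exists, what it must contain, how it relates to other roles, and how it is decomposed:

```python
from dataclasses import dataclass, field

@dataclass
class ModelRole:
    name: str
    purpose: str                                              # existential intent
    content_criteria: list = field(default_factory=list)      # what the model must contain
    relationships: list = field(default_factory=list)         # (relation, other role) pairs
    decomposition: list = field(default_factory=list)         # sub-roles it splits into

# Hypothetical example of a role-level specification.
design = ModelRole(
    name="ClassDiagram::Billing",
    purpose="Describe the static structure of the billing subsystem for developers",
    content_criteria=["every persistent entity appears as a class",
                      "all associations carry multiplicities"],
    relationships=[("refines", "ComponentDiagram::Billing")],
    decomposition=["ClassDiagram::Invoicing", "ClassDiagram::Payments"],
)
print(design.purpose)
```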
27 |
Considering Mobile Devices, Context Awareness, and Mobile Users. Su, Jing Chih. 17 February 2011
Recent years have seen rapid growth and adoption of powerful mobile devices such as smartphones, equipped with sophisticated input and display systems and multiple communication technologies. This trend has coincided with the rapid deployment and adoption of high-speed Internet services and web-based applications. While this rapid development of mobile technology has provided great opportunities, it also presents significant new challenges compared to traditional desktop computing. Specifically, unlike the traditional desktop computing experience, where users are stationary and physically isolated, users in mobile and social settings can be faced with real-time demands on their attention.
This thesis examines the relationship between mobile devices, context awareness, and mobile users. We propose the use of physical proximity context to adapt and improve system behavior, and enable mobile users to more effectively access and share content in non-desktop settings. This work identifies three distinct challenges in mobile software, and addresses these challenges using physical proximity context awareness. First we address improvements to mobile node network utilization by using proximity awareness to automatically manage local radio resources. Next we address improvements to mobile web-backed applications and services by enabling social proximity awareness. Finally, we enable greater mobility and physical awareness for visually impaired users on mobile devices by providing an interface which enables exploration of spatial geometric layouts.
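A simplified sketch of the radio-management idea (peer, proximity, and cost abstractions are hypothetical; the mechanisms in the thesis are more involved): when nearby peers request the same content, have one of them fetch it over the wide-area link and let the others receive it over a cheaper local radio.

```python
def plan_downloads(requests, proximity, local_cost=1, wan_cost=10):
    """Assign each item one WAN fetcher; peers in proximity to that fetcher get it locally.

    `requests` maps peer -> set of item ids; `proximity` maps peer -> set of nearby peers.
    """
    fetch_plan, total = {}, 0
    for peer, items in requests.items():
        for item in items:
            fetcher = fetch_plan.get(item)
            if fetcher is not None and (fetcher == peer or fetcher in proximity.get(peer, set())):
                total += local_cost      # a nearby peer already fetched it: local transfer
            else:
                fetch_plan.setdefault(item, peer)
                total += wan_cost        # first (or out-of-range) request goes over the WAN
    return fetch_plan, total

# Hypothetical usage: A and B are in proximity and both want "video1".
plan, cost = plan_downloads(
    requests={"A": {"video1"}, "B": {"video1"}, "C": {"video2"}},
    proximity={"A": {"B"}, "B": {"A"}, "C": set()},
)
print(plan, cost)   # {'video1': 'A', 'video2': 'C'} 21  (vs. 30 with three WAN fetches)
```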
28 |
Cryptography: Leakage Resilience, Black Box Separations, and Credential-free Key Exchange. Vahlis, Evgene. 17 February 2011
We study several basic problems in cryptography.
Leakage-resilient cryptography: Cryptographic schemes are often broken through side-channel attacks on the devices that run them. Such attacks typically involve an adversary that is within a short distance of the device and is able to measure various physical characteristics of the device, such as power consumption, timing, heat, and sound emanation. We show how to immunize any cryptographic functionality against arbitrary side-channel attacks using the recently achieved fully homomorphic encryption and a single piece of secure hardware that samples from a public distribution. Our secure hardware never touches any secret information (such as a private key) and is testable, in the sense that its inputs are not influenced by user or adversarial inputs.
Credential-free key exchange and sessions: One of the most basic tasks in cryptography is to allow two parties
that are connected by a completely insecure channel to communicate securely. Typically, the first step towards achieving this is an exchange of a session key. Such an exchange normally requires an infrastructure where, for example, public keys of users are stored and can be securely retrieved. However, such an infrastructure often does not exist or is too costly to maintain. In such a setting an adversary can always act as a Man-In-The-Middle and intercept all communications. Nevertheless, we argue that a meaningful level of security can still be achieved. We present a definition of secure key exchange in a setting without any infrastructure, and describe a protocol that achieves that type of security. The idea is that an adversary should either learn nothing about the session key produced by the protocol, or be forced to participate in two independent instances of the protocol.
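As classical background for this infrastructure-free setting (an unauthenticated Diffie-Hellman sketch over a toy 64-bit group, far too small to be secure; it is not the protocol or the security definition given in the thesis): two parties with no shared credentials derive the same key, a passive eavesdropper learns nothing useful about it under standard assumptions, and an active man-in-the-middle must run a separate instance with each party.

```python
import secrets

P = 2**64 - 59   # a 64-bit prime modulus, used only for illustration
G = 5            # illustrative base; a real deployment would use standardized parameters

def keygen():
    x = secrets.randbelow(P - 2) + 1     # private exponent
    return x, pow(G, x, P)               # (private, public)

a_priv, a_pub = keygen()                  # party A
b_priv, b_pub = keygen()                  # party B

# Each side combines its own private exponent with the other's public value.
k_a = pow(b_pub, a_priv, P)
k_b = pow(a_pub, b_priv, P)
assert k_a == k_b                         # both derive the same session-key material
print(hex(k_a))
```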
Black-box separations: A complementary aspect of cryptographic research is the study of the limits of cryptographic assumptions. Basing constructions on weaker assumptions gives us more confidence in their security. We therefore wish to find, for each standard cryptographic assumption, which tasks cannot be solved based solely on that assumption. In this thesis we study the limits of a very basic public-key primitive: trapdoor permutations (TDPs). We show that TDPs cannot be used to construct Identity Based Encryption or a stronger type of TDPs called correlation secure TDPs. Correlation secure TDPs have been used to build chosen-ciphertext secure public-key encryption schemes -- a primitive with a wide range of theoretical and practical applications.
29 |
Expressive Motion Editing Using Motion Extrema. Coleman, Patrick. 21 August 2012
When animating characters, a key goal is the creation of a believable, expressive performance that gives a character a unique personality with a distinct style of movement. While animators are skilled at creating expressive, personalized performances, it remains challenging to change performance-related aspects of movement in existing motion data. In recent years, motion data reuse has become increasingly important as recorded motion capture data has come into widespread use. This thesis investigates the use of a sparse set of pose-centric editing controls for editing existing motion data using techniques similar to those used by keyframe animators when they create new motion. To do this, this thesis proposes the use of motion extrema--the poses a character passes through when there is a significant change in movement--as a means for choosing effective pose-centric editing controls.
First, I present algorithms for identifying motion extrema. Motion extrema can be associated with individual joints or the full body of the character. I introduce a set of approaches for identifying motion extrema; these include the use of extrema of differential measures and the explicit search for times at which the body or a joint is in a spatially extreme configuration.
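A small sketch of the differential-measure approach mentioned above (assuming a motion curve sampled as one joint's positions over time; names and thresholds are illustrative): local minima of joint speed are natural candidates for motion extrema, since the joint momentarily slows or reverses there.

```python
import numpy as np

def motion_extrema(positions, dt=1.0 / 30.0):
    """Return frame indices where a joint's speed has a local minimum.

    `positions` is a (num_frames x 3) array of one joint's world-space positions.
    """
    velocity = np.gradient(positions, dt, axis=0)
    speed = np.linalg.norm(velocity, axis=1)
    # A frame is a candidate extremum if its speed is no larger than either neighbour's.
    interior = (speed[1:-1] <= speed[:-2]) & (speed[1:-1] <= speed[2:])
    return np.where(interior)[0] + 1

# Hypothetical usage: a joint oscillating along x; extrema fall near the turning points.
t = np.linspace(0, 2 * np.pi, 120)
pos = np.stack([np.sin(t), np.zeros_like(t), np.zeros_like(t)], axis=1)
print(motion_extrema(pos, dt=t[1] - t[0]))
```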
I then present three motion editing applications that use motion extrema as a foundation for applying motion edits. The first application, pose-centric editing, allows users to interactively change poses in a motion, and the system modifies the motion to respect existing ground contact. The second application, staggered poses, introduces a model of character pose that explicitly encodes how timing varies among motion extrema on different parts of the body. This timing variation is commonly used by animators to model overlapping action. By introducing an algorithm for finding timing variation on motion extrema in existing motion, this system enables users to make high-level changes to timing patterns to change overlap effects in existing motion.
Finally, I present a procedural motion editing application that targets a specific aspect of motion style; this technique is called spatial exaggeration. Spatial exaggeration changes the geometric relationships among extreme poses. Such edits cause movement to appear more or less energetic. Overall, these applications demonstrate that performance-related aspects of existing motion can be edited using a sparse set of controls in the form of motion extrema.
30 |
Polymorphism and Genome Assembly. Donmez, Nilgun. 11 December 2012
When Darwin introduced natural selection in 1859 as a key mechanism of evolution, little was known about the underlying cause of variation within a species. Today we know that this variation is caused by the acquired genomic differences between individuals. Polymorphism, defined as the existence of multiple alleles or forms at a genomic locus, is the technical term used for such genetic variations.
Polymorphism, along with reproduction and inheritance of genetic traits, is a necessary condition for natural selection and is crucial in understanding how species evolve and adapt. Many questions regarding polymorphism, such as why certain species are more polymorphic than others or why different organisms tend to favor some types of polymorphism over others, have the potential, once answered, to shed light on important problems in human medicine and disease research.
Some of these studies require more diverse species and/or individuals to be sequenced. Of particular interest are species with the highest rates of polymorphism. For instance, the sequencing of the sea squirt genome led to exciting studies that would not have been possible with species that possess lower levels of polymorphism. Such studies form the motivation of this thesis.
Sequencing genomes is, nonetheless, a subject of research in its own right. Recent advances in DNA sequencing technology have enabled researchers to undertake an unprecedented number of sequencing projects. These improvements in the cost and availability of sequencing have revived interest in advancing the algorithms and tools used for genome assembly. A majority of these tools, however, have little or no support for highly polymorphic genomes, which, we believe, require specialized methods.
In this thesis, we examine the challenges that polymorphism imposes on genome assembly and, through an overview of current and past methods, develop methods for polymorphic genome assembly. Though we borrow fundamental ideas from the literature, we introduce several novel concepts that can be useful not only for the assembly of highly polymorphic genomes but also for genome assembly and analysis in general.
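As background on why polymorphism complicates assembly (a standard de Bruijn-graph illustration, not one of the methods developed in this thesis): a heterozygous site makes reads from the two haplotypes diverge and then reconverge, creating a "bubble" of alternative paths in the k-mer graph that a haploid-oriented assembler must either collapse or break on.

```python
from collections import defaultdict

def de_bruijn(reads, k=5):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges follow observed k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return graph

# Two hypothetical haplotypes differing at a single site (C vs. G): the paths share a
# prefix and a suffix but diverge in the middle, forming a bubble.
hap1 = "ACGTAACTTGCA"
hap2 = "ACGTAAGTTGCA"
graph = de_bruijn([hap1, hap2], k=5)
branch_points = {node: succs for node, succs in graph.items() if len(succs) > 1}
print(branch_points)   # e.g. {'GTAA': {'TAAC', 'TAAG'}} -- where the bubble opens
```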