101 |
The Multilingual Forest: Investigating High-quality Parallel Corpus Development
Adesam, Yvonne (January 2012)
This thesis explores the development of parallel treebanks, collections of language data consisting of texts and their translations, with syntactic annotation and alignment, linking words, phrases, and sentences to show translation equivalence. We describe the semi-manual annotation of the SMULTRON parallel treebank, consisting of 1,000 sentences in English, German and Swedish. This description is the starting point for answering the first of two questions in this thesis: what issues need to be considered to achieve a high-quality, consistent, parallel treebank? The units of annotation and the choice of annotation schemes are crucial for quality, and some automated processing is necessary to increase the size. Automatic quality checks and evaluation are essential, but manual quality control is still needed to achieve high quality. Additionally, we explore improving the automatically created annotation for one language, using information available from the annotation of the other languages. This leads us to the second of the two questions in this thesis: can we improve automatic annotation by projecting information available in the other languages? Experiments with automatic alignment, which is projected from two language pairs, L1–L2 and L1–L3, onto the third pair, L2–L3, show an improvement in precision, in particular if the projected alignment is intersected with the system alignment. We also construct a test collection for experiments on annotation projection to resolve prepositional phrase attachment ambiguities. While majority vote projection improves the annotation, compared to the basic automatic annotation, using linguistic clues to correct the annotation before majority vote projection is even better, although more laborious. However, some structural errors cannot be corrected by projection at all, as different languages have different wording, and thus different structures.
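The alignment projection described in this abstract can be pictured with a small sketch. It assumes alignments are stored as sets of token-index pairs, and the function name `project_alignment` and all link data are made up for illustration; the thesis's own procedure and tooling may differ.

```python
def project_alignment(l1_l2, l1_l3):
    """Project links onto the L2-L3 pair through the shared L1 tokens.

    Each alignment is a set of (source_index, target_index) pairs
    (an assumed representation, not taken from the thesis).
    """
    projected = set()
    for l1_a, l2_tok in l1_l2:
        for l1_b, l3_tok in l1_l3:
            if l1_a == l1_b:  # both links share the same L1 token
                projected.add((l2_tok, l3_tok))
    return projected


# Toy links, invented for illustration only.
l1_l2 = {(0, 0), (1, 2), (2, 1)}
l1_l3 = {(0, 0), (1, 1), (2, 2)}
system_l2_l3 = {(0, 0), (2, 1), (1, 0)}  # output of a hypothetical aligner

projected = project_alignment(l1_l2, l1_l3)
# Keeping only links that both sources agree on trades recall for precision.
intersected = projected & system_l2_l3
```

Here the projected set contains three links, while the intersection keeps only the two that the hypothetical system alignment also proposes.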
|
102 |
Reasoning About Multi-stage Programs
Inoue, Jun (24 July 2013)
Multi-stage programming (MSP) is a style of writing program
generators (programs which generate programs) supported by special
annotations that direct construction, combination, and execution of
object programs. Various researchers have shown MSP to be effective
in writing efficient programs without sacrificing genericity.
However, correctness proofs of such programs have so far received
limited attention, and approaches and challenges for that task have
been largely unexplored. In this thesis, I establish formal
equational properties of the multi-stage lambda calculus and related
proof techniques, as well as results that delineate the intricacies
of multi-stage languages that one must be aware of.
In particular, I settle three basic questions that naturally arise
when verifying multi-stage functional programs. Firstly, can adding
staging to a language compromise the interchangeability of terms
that held in the original language? Unfortunately it can, and more
care is needed to reason about terms with free variables. Secondly,
staging annotations, as the term "annotations" suggests, are often
thought to be orthogonal to the behavior of a program, but when is
this formally guaranteed to be the case? I give termination
conditions that characterize when this guarantee holds. Finally, do
multi-stage languages satisfy extensional facts, for example that
functions agreeing on all arguments are equivalent? I develop a
sound and complete notion of applicative bisimulation, which can
establish not only extensionality but, in principle, any other valid
program equivalence as well. These results improve our general
understanding of staging and enable us to prove the correctness of
complicated multi-stage programs.
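For readers unfamiliar with program generators, the following is a rough sketch of the idea only. Python has no staging annotations, so string construction and `exec` stand in for the constructs that build and run object programs in a real multi-stage language; the names `gen_power`, `run`, and `power` are hypothetical and do not come from the thesis.

```python
def power(x, n):
    """Unstaged reference: compute x**n by recursion on n."""
    return 1 if n == 0 else x * power(x, n - 1)


def gen_power(n):
    """Generate source for a function that computes x**n with the recursion unrolled."""
    body = "1"
    for _ in range(n):
        body = f"x * ({body})"
    return f"def power_{n}(x):\n    return {body}\n"


def run(source, name):
    """Execute generated source and return the function it defines."""
    namespace = {}
    exec(source, namespace)
    return namespace[name]


power_3 = run(gen_power(3), "power_3")
```

The correctness question the abstract raises is, in this toy setting, whether `power_3(x)` agrees with `power(x, 3)` for every `x`.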
|
103 |
Method-Specific Access Control in Java via Proxy Objects using Annotations
Zarnett, Jeffrey (January 2010)
Partially restricting access to objects enables system designers to finely control the security of their systems. We propose a novel approach that allows granting partial access at method granularity on arbitrary objects to remote clients, using proxy objects.
Our initial approach considers methods to be either safe (may be invoked by anyone) or unsafe (may be invoked only by trusted users). We next generalize this approach by supporting Role-Based Access Control (RBAC) for methods in objects. In our approach, a policy implementer annotates methods, interfaces, and classes with roles. Our system automatically creates proxy objects for each role, which contain only methods to which that role is authorized.
This thesis explains the method annotation process, the semantics of annotations, how we derive proxy objects based on annotations, and how clients invoke methods via proxy objects. We present the advantages of our approach, and distinguish it from existing approaches to method-granularity access control. We provide detailed semantics of our system, in First Order Logic, to describe its operation.
We have implemented our system in the Java programming language and evaluated its performance and usability. Proxy objects have minimal overhead: creation of a proxy object takes an order of magnitude less time than retrieving a reference to a remote object. Deriving the interface, a one-time cost, is on the same order as retrieval. We present empirical evidence of the effectiveness of our approach by discussing its application to software projects that range from thousands to hundreds of thousands of lines of code; even large software projects can be annotated in less than a day.
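The idea of deriving a per-role proxy from method annotations can be sketched as follows. This is a Python analogue written for illustration, not the thesis's Java system: the decorator `roles`, the class `RoleProxy`, and the `Account` example are all hypothetical, and because Python bound methods still expose their target, the sketch shows interface derivation only and is not a security boundary.

```python
def roles(*allowed):
    """Annotate a method with the roles permitted to invoke it."""
    def mark(method):
        method.allowed_roles = frozenset(allowed)
        return method
    return mark


class Account:
    def __init__(self):
        self._balance = 100

    @roles("teller", "auditor")
    def balance(self):
        return self._balance

    @roles("teller")
    def withdraw(self, amount):
        self._balance -= amount
        return self._balance


class RoleProxy:
    """Expose only the methods of `target` that `role` is annotated for."""

    def __init__(self, target, role):
        for name in dir(type(target)):
            member = getattr(type(target), name)
            # Members without an annotation are treated as not authorized.
            if role in getattr(member, "allowed_roles", ()):
                setattr(self, name, getattr(target, name))


account = Account()
auditor_view = RoleProxy(account, "auditor")
teller_view = RoleProxy(account, "teller")
```

A client holding `auditor_view` can call `balance()` but finds no `withdraw` attribute at all, which mirrors the abstract's description of proxies that contain only the methods a role is authorized for.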
|
105 |
NOVEL APPROACH TO STORAGE AND SORTING OF NEXT GENERATION SEQUENCING DATA FOR THE PURPOSE OF FUNCTIONAL ANNOTATION TRANSFER
Candelli, Tito (January 2012)
The functional annotation of novel sequences has been a significant issue for many laboratories that have applied next generation sequencing techniques to less studied species. Experiments such as transcriptome analysis suffer heavily from this problem, because their results cannot be placed in a relevant biological context. Several tools have been proposed to solve this problem through homology annotation transfer. The principle behind this strategy is that homologous genes share common functions in different organisms, and therefore annotations are transferable between these genes. Commonly, BLAST reports are used to identify a suitable homologous gene in a well annotated species, and the annotation is then transferred from the homologue to the novel sequence. Not all homologues, however, possess valid functional annotations. The aim of this project was to devise an algorithm to process BLAST reports and provide a criterion to discriminate between homologues with biologically informative and uninformative annotations. In addition, all data obtained from the BLAST report is to be stored in a relational database for ease of consultation and visualization. To test the robustness of the system, we used 750 novel sequences obtained by applying next generation sequencing techniques to Avena sativa samples. This species suits our needs particularly well, as it represents the typical target for homology annotation transfer: it lacks a reference genome and is difficult to annotate functionally. The system was able to perform all the required tasks. Comparisons between best hits as determined by BLAST and best hits as determined by the algorithm showed a significant increase in the biological significance of the results when the algorithm's sorting system was applied.
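The contrast between the BLAST best hit and an annotation-aware best hit can be illustrated with a minimal sketch. The abstract does not state the actual criterion, so the keyword list `UNINFORMATIVE`, the `(description, e_value)` hit format, and the sample hits below are assumptions made up for this example.

```python
# Assumed markers of an uninformative annotation; not taken from the thesis.
UNINFORMATIVE = ("hypothetical", "uncharacterized", "unknown", "predicted protein")


def is_informative(description):
    """Return True if the hit description carries no uninformative marker."""
    text = description.lower()
    return not any(term in text for term in UNINFORMATIVE)


def best_informative_hit(hits):
    """Return the informative hit with the lowest e-value, or None if there is none.

    hits: list of (description, e_value) tuples.
    """
    informative = [hit for hit in hits if is_informative(hit[0])]
    return min(informative, key=lambda hit: hit[1], default=None)


# Invented hits for illustration only.
hits = [
    ("hypothetical protein", 1e-80),
    ("sucrose synthase 2", 1e-75),
    ("unknown protein", 1e-60),
]
blast_best = min(hits, key=lambda hit: hit[1])
chosen = best_informative_hit(hits)
```

In this toy input the plain best hit is a hypothetical protein, while the filtered selection returns the slightly weaker hit that carries a usable functional description.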
|
106 |
L'annotation pour la recherche d'information dans le contexte d'intelligence économique [Annotation for information retrieval in the context of economic intelligence]
Robert, Charles (16 February 2007)
We believe that annotation should help transform collected information into value-added information that is better suited to decision making.
We consider annotation in the economic intelligence process as a function of the annotation period, the users, and the documents. Annotations on one or more documents, by one or more users, can be used to assess the orientation and interests of individuals as they try to solve a decision problem.
The set of annotations can be represented as {Ai, the set of annotations; Ui, the set of users; Tj, the annotation periods; and Dk, the set of documents}, and we have called it AMIE.
The parameters Ui, Tj, Dk can be fixed or varied in order to obtain the annotations needed for decision making.
We developed and tested the model through an application to the domain of access to information resources on the Internet.
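Fixing or varying the user, period, and document parameters amounts to filtering a set of annotation records. The sketch below is one assumed reading of that idea; the record type `Annotation`, the function `select`, and the sample data are hypothetical and are not taken from the AMIE model itself.

```python
from collections import namedtuple

# One annotation record: its text plus the user, period and document it belongs to.
Annotation = namedtuple("Annotation", "text user period document")


def select(annotations, users=None, periods=None, documents=None):
    """Keep annotations matching every fixed parameter; None leaves a parameter free."""
    return [
        a for a in annotations
        if (users is None or a.user in users)
        and (periods is None or a.period in periods)
        and (documents is None or a.document in documents)
    ]


# Invented records for illustration only.
annotations = [
    Annotation("pricing looks aggressive", "alice", "2006-Q1", "report.pdf"),
    Annotation("check supplier claims", "bob", "2006-Q1", "report.pdf"),
    Annotation("new competitor mentioned", "alice", "2006-Q2", "news.html"),
]

by_alice = select(annotations, users={"alice"})
q1_report = select(annotations, periods={"2006-Q1"}, documents={"report.pdf"})
```

Fixing the user and leaving the rest free shows one person's interests over time; fixing the period and document shows how several users read the same source.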
|
107 |
Comparative Genomics in Two Dicot Model Systems
Park, Gyoungju Nah (January 2008)
Comparative sequence analyses were performed with members of the Solanaceae and the Brassicaceae. These studies investigated genomic organization, determined levels of microcolinearity, identified orthologous genes and investigated the molecular basis of trait differences. The first analysis was performed by comparison of tomato (Solanum lycopersicum) genomic sequence (119 kb) containing the JOINTLESS1 (J1) locus with orthologous sequences from two potato species, a diploid, Solanum bulbocastanum (800-900 Mb, 2N=2X=24), and a hexaploid, Solanum demissum (2,700 Mb, 2N=6X=72). Gene colinearity was well maintained across all three regions. Twelve orthologous open reading frames were identified in identical order and orientation and included three putative J1 orthologs with 93-96% amino acid sequence identity in both potato species. Although these regions were highly conserved, several local disruptions were detected and included small-scale expansion/contraction regions with intergenic sequences, non-colinear genes and transposable elements. Three putative Solanaceous-specific genes were also identified in this analysis. The second analysis was performed by comparison of a Thellungiella halophila (T. halophila) genomic sequence (193 kb) containing the SALT OVERLY SENSITIVE1 (SOS1) locus with the orthologous sequence (146 kb) in Arabidopsis thaliana (Arabidopsis). T. halophila is a halophytic relative of Arabidopsis thaliana that exhibits extreme salt tolerance. Twenty-five genes, including the putative T. halophila SOS1 (ThSOS1), showed a high degree of colinearity with Arabidopsis genes in the corresponding region. Although the two sequences were significantly colinear, several local rearrangements were detected which were caused by tandem duplications and inversions. Three major expansion/contraction regions in T. halophila contained five LTR retrotransposons which contributed to genomic size variation in this region. 
ThSOS1 shares similar gene structure and sequence with Arabidopsis SOS1 (AtSOS1), including 11 transmembrane domains and a cyclic nucleotide-binding domain. Three Simple Sequence Repeats (SSRs) were detected within a 540 bp region upstream of the putative translational start site in ThSOS1. The (CTT)n repeat is present in different copy numbers in ThSOS1 (18 repeats) and AtSOS1 (3 repeats). When present in the 5' UTRs of some Arabidopsis genes, (CTT)n serves as a putative salicylic acid responsive element. These SSRs may serve as cis-acting elements affecting differential mRNA accumulation of SOS1 in the two species.
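Counting the copies of a simple sequence repeat such as (CTT)n can be sketched with a regular expression. The function `longest_repeat` and the two synthetic sequences are assumptions for illustration; they are not the real upstream regions of ThSOS1 or AtSOS1, and the thesis does not say which tool was used.

```python
import re


def longest_repeat(sequence, motif):
    """Return the largest number of uninterrupted tandem copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence.upper())
    return max((len(run) // len(motif) for run in runs), default=0)


# Synthetic sequences built to contain 18 and 3 tandem copies of CTT.
upstream_a = "GGA" + "CTT" * 18 + "ATG"
upstream_b = "GGA" + "CTT" * 3 + "ATG"

copies_a = longest_repeat(upstream_a, "CTT")
copies_b = longest_repeat(upstream_b, "CTT")
```

The two counts reproduce the 18-versus-3 copy-number contrast reported in the abstract, on made-up input.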
|
108 |
Gene Ontology-based framework to annotate genes of hearing
Ovezmyradov, Guvanchmyrat (23 October 2012)
No description available.
|
109 |
In-silico characterization and prediction of protein-small ligand interactions
Chen, Ke (date unknown)
No description available.
|
110 |
Correlating illustrations and text through interactive annotation: computer-aided support for textbooks
Götzelmann, Timo (January 1900)
Also published as: dissertation, Univ. Magdeburg, 2007 / Produced on demand
|