Global ETD Search

1	PELICAN : a PipELIne, including a novel redundancy-eliminating algorithm, to Create and maintain a topicAl family-specific Non-redundant protein database Andersson, Christoffer January 2005 The increasing number of biological databases requires that users be able to search more efficiently both across and within individual databases. One of the most widespread problems is redundancy, i.e. duplicated information in sets of data. This thesis aims to implement an algorithm that differs from related approaches by using the genomic positions of sequences, instead of similarity-based sequence comparisons, when making a sequence data set non-redundant. Used in an automatic updating procedure, the algorithm makes it much easier to update a non-redundant database and keep it topical. The procedure creates a biologically sound non-redundant data set with accuracy comparable to that of other algorithms that focus on making data sets non-redundant. redundancy; BLAT; genomic positions; profile hidden Markov models; G-protein coupled receptors; Bioinformatics; Bioinformatik
3	Discovery Of Application Workloads From Network File Traces Yadwadkar, Neeraja 12 1900 An understanding of the Input/Output data access patterns of applications is useful in several situations. First, gaining insight into what applications are doing with their data at a semantic level helps in designing efficient storage systems. Second, it helps to create benchmarks that closely mimic realistic application behavior. Third, it enables autonomic systems, as the information obtained can be used to adapt the system in a closed loop. All these use cases require the ability to extract the application-level semantics of I/O operations. Methods such as modifying application code to associate I/O operations with semantic tags are intrusive. It is well known that network file system traces are an important source of information that can be obtained non-intrusively and analyzed either online or offline. These traces are a sequence of primitive file system operations and their parameters. Simple counting, statistical analysis or deterministic search techniques are inadequate for discovering application-level semantics in the general case, because of the inherent variation and noise in realistic traces. In this paper, we describe a trace analysis methodology based on Profile Hidden Markov Models. We show that the methodology has powerful discriminatory capabilities that enable it to recognize applications based on the patterns in the traces, and to mark out regions in a long trace that encapsulate sets of primitive operations representing higher-level application actions. It is robust enough to work around discrepancies between training and target traces, such as differences in length and interleaving with other operations. We demonstrate the feasibility of recognizing patterns based on a small sampling of the trace, enabling faster trace analysis.
Preliminary experiments show that the method is capable of learning accurate profile models on live traces in an online setting. We present a detailed evaluation of this methodology in a UNIX environment using NFS traces of selected commonly used applications such as compilations, as well as industrial-strength benchmarks such as TPC-C and Postmark, and discuss its capabilities and limitations in the context of the use cases mentioned above. File Tracing (Computer Networks); Computer Communication; Profile Hidden Markov Models; Sequence Alignment; Network File System (NFS); Network File Traces; Hidden Markov Models (HMMs); Computer Science
4	Malware Analysis using Profile Hidden Markov Models and Intrusion Detection in a Stream Learning Setting Saradha, R January 2014 In the last decade, many machine learning and data mining based approaches have been used in the areas of intrusion detection, malware detection and classification, and traffic analysis. In the area of malware analysis, static binary analysis techniques have become increasingly difficult to apply because of the code obfuscation and code packing methods employed when writing malware. For this reason, behavior-based analysis techniques are being used in large malware analysis systems. In prior work, a number of clustering and classification techniques have been used to classify malware into families, and to identify new malware families, from the behavior reports. In this thesis, we analyse in detail the use of Profile Hidden Markov Models for the problem of malware classification and clustering. The ability to build accurate models from limited examples is very helpful in early detection and modeling of malware families. The thesis also revisits the learning setting of an Intrusion Detection System (IDS) that employs machine learning to distinguish attacks from normal traffic. It substantiates the suitability of the incremental learning setting (or stream-based learning setting) for the problem of learning attack patterns in an IDS when large volumes of data arrive in a stream. Related to this problem, an elaborate survey of IDSs that use data mining and machine learning was conducted. Experimental evaluation and comparison show that, in terms of speed and accuracy, the stream-based algorithms perform very well when large volumes of data are presented for classification as attack or non-attack patterns. The possibilities for using stream algorithms in different security problems are elucidated in the conclusion.
Malware (Malicious Software); Malware, Cyber Attacks; Malware Analysis; Profile Hidden Markov Models; Intrusion Detection Systems; Data Mining; Malware Classification and Clustering; Machine Learning; Malware Detection; Cyber Attacks; Stream-based Learning; Polymorphic Malware Detection; Huffman Encoding; Stream Algorithms; Computer Science
5	Microbe-Environment Interactions in Arctic and Subarctic Systems Zayed, Ahmed Abdelfattah 30 September 2019 No description available. Biogeochemistry Bioinformatics Biology Biological Oceanography Climate Change Ecology Environmental Science Geobiology Microbiology Oceanography Soil Sciences Statistics Virology viruses microbial expression metagenomics metaproteomics community ecology population ecology community expression marine biology peat carbon degradation permafrost climate change diversity gradients Profile Hidden Markov Models
