361. Building an Essential Gene Classification Framework (Saha, Soma, 05 January 2006)
The analysis of gene deletions is a fundamental approach for investigating gene function. We applied machine learning techniques to predict phenotypic effects of gene deletions in yeast. We created a dataset containing features that potentially have predictive power and then used feature processing techniques to improve the dataset and identify features that are important for our classification problem. We evaluated four different classification algorithms, K-Nearest Neighbors, Support Vector Machine, Decision Tree, and Random Forest, with respect to this problem. We used our framework to complement the set of experimentally determined essential yeast genes produced by the Saccharomyces Genome Deletion Project and produce more than 2000 annotations for genes that might cause morphological alterations in yeast.
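As a rough illustration of the kind of classifier comparison described in this abstract, the sketch below evaluates the four named algorithms with cross-validation using scikit-learn; the feature matrix and labels are synthetic placeholders, not the study's gene-deletion data.

```python
# Sketch of comparing the four classifiers named in the abstract.
# X (features per gene) and y (essential / non-essential labels) are
# placeholders; the real features and labels come from the study's dataset.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                          # placeholder feature matrix
y = (X[:, 0] + rng.normal(size=500) > 0).astype(int)    # placeholder labels

classifiers = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(kernel="rbf", C=1.0),
    "Decision Tree": DecisionTreeClassifier(max_depth=5),
    "Random Forest": RandomForestClassifier(n_estimators=100),
}

for name, clf in classifiers.items():
    scores = cross_val_score(clf, X, y, cv=5)            # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```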
362. Deriving Efficient SQL Sequences Via Prefetching (Bilgin, Ahmet Soydan, 04 January 2008)
Modern information architectures place business logic in an application server and persistent objects in a relational DBMS. To realize such architectures effectively, we must surmount the problem of efficiently fetching objects from the DBMS into the application server. Object access patterns are not random; they are driven by applications and user behaviors. Naive implementations retrieve objects from the DBMS as the application requests them, costing a DBMS roundtrip for each query. This fact, coupled with the growing performance bottleneck of computer storage systems, has motivated a significant body of research on improving object access behavior by predicting which objects will be accessed next, because throughput will continue to improve but latency will not. Latency will be an ever-increasing component of object access cost, and object access cost is usually the bottleneck in modern high-performance systems. The result is unacceptably poor performance when an application server submits a sequence of relational queries to a DBMS. A reasonable approach is to generate prefetch queries that retrieve objects the application will request later. However, whereas some prefetch queries are beneficial, others are not, and distinguishing between them is nontrivial in practice because commercial DBMSs do not expose efficient query response-time estimators. First, there is no standardized interface through which an application server can ask the database system to estimate the cost (e.g., response time) of a given query. Second, the total cost of query evaluation must also account for the roundtrip costs between the application server and the DBMS. Consequently, in current practice, programmers spend enormous amounts of time tuning the queries by which an application retrieves objects. This dissertation develops an application-independent approach for generating prefetch queries that can be implemented in conventional middleware systems. Its main contribution is a set of application-independent guidelines for selecting, based on the application's access patterns and additional parameters, efficient ways of merging the application's data requests into prefetch queries. The guidelines take the current configuration, such as local or wide area networks, into account, allowing the middleware to select strategies that perform well across a wide range of configurations. The ensuing performance gains are evaluated in realistic settings based on a retail database inspired by the SPECJ performance test suite.
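A minimal sketch of the core idea of merging an application's data requests into a prefetch query, assuming a hypothetical orders table; real middleware would also weigh cost estimates, access-pattern predictions, and network configuration before deciding whether to merge.

```python
# Sketch: instead of issuing one SELECT per requested object (one DBMS
# roundtrip each), merge a predicted access sequence into a single prefetch query.
def naive_queries(order_ids):
    # one roundtrip per object
    return [f"SELECT * FROM orders WHERE id = {oid}" for oid in order_ids]

def prefetch_query(order_ids):
    # one roundtrip for the whole predicted sequence
    id_list = ", ".join(str(oid) for oid in order_ids)
    return f"SELECT * FROM orders WHERE id IN ({id_list})"

predicted_access = [101, 102, 107]          # hypothetical access pattern
print(naive_queries(predicted_access))      # three roundtrips
print(prefetch_query(predicted_access))     # one roundtrip
```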
363. The Terminal Node Controlled Routing Protocol for Mobile Ad Hoc Networks (Raghavan, Sudarshan Narasimha, 20 January 2003)
The purpose of this research has been to study the influence of unidirectional links on routing in mobile ad hoc networks (MANETs) and to develop a new protocol for routing in ad hoc networks with unidirectional links. In this thesis, we propose and discuss the Terminal Node Controlled Routing Protocol (TNCR) for mobile ad hoc networks. TNCR is based on the concept of source routing and provides optimal routes on an on-demand basis by utilizing unidirectional links in the network. Unlike existing protocols, TNCR uses an end-to-end probing technique for discovering path failures. Information about stale routes is propagated using a source-initiated token-passing mechanism. A source-tagging technique is used for caching routes, enabling nodes to learn new routes and flush stale routes from the cache. The protocol uses a reverse-caching technique to reduce the number of network-wide broadcasts required for route establishment and a broadcast-limiting mechanism to reduce the overhead due to broadcast messages. To decrease the per-packet overhead due to source routing, TNCR employs a route-stripping technique for forwarding unicast packets. The protocol avoids the use of link-layer acknowledgements and the intervention of intermediate nodes in route maintenance. It thus provides a highly efficient, link-layer-independent framework for routing in ad hoc networks in the presence of unidirectional links. This framework is best suited for ad hoc networks composed of heterogeneous mobile nodes with varying transmission ranges. In addition, the protocol can be configured to operate in a bidirectional mode in which it uses a path-reversal mechanism for further optimization and overhead reduction. This makes the protocol well suited for all network topologies. We evaluate the performance of the proposed protocol through detailed simulation, measurements, and comparison with existing protocols.
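The route-stripping idea can be illustrated with a small sketch: each forwarding step removes the next hop from the source route carried in the packet, so the header shrinks along the path. This is a simplified illustration of the general technique, not the TNCR implementation.

```python
# Sketch of route stripping for source-routed unicast packets: each forwarding
# step removes the consumed hop from the header, shrinking per-packet overhead.
def forward(packet):
    route = packet["route"]
    if not route:
        return None                      # packet has reached the destination
    next_hop = route[0]
    stripped = {"payload": packet["payload"], "route": route[1:]}
    return next_hop, stripped            # hand the smaller packet to next_hop

pkt = {"payload": "data", "route": ["B", "C", "D"]}   # source A -> D via B, C
while True:
    hop = forward(pkt)
    if hop is None:
        break
    node, pkt = hop
    print(f"forwarded to {node}, remaining route {pkt['route']}")
```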
364. Feasibility study of secure and robust location determination in current generation of wireless sensor networks (Shah, Pratik, 06 January 2006)
Location determination has been a fundamental requirement in many wireless sensor network applications, and various schemes have been proposed to solve this problem. These schemes depend on measurements of physical quantities such as time of flight, angle of arrival, time difference of arrival, and signal strength. Real-world measurements are affected by environmental conditions and contain unavoidable errors. Statistical techniques such as MMSE have been shown to be tolerant of such errors. However, in hostile environments, attackers can alter the measurements significantly enough to render these schemes useless. Security mechanisms such as authentication and encryption can thwart external attacks such as eavesdropping and spoofing, but attacks specific to location determination schemes differ from conventional security attacks and have been shown to succeed even when adequate security mechanisms are in place. Recently, AR-MMSE, LMS, and voting-based schemes have been proposed to resist these attacks, and a technique has also been proposed for detecting attacker nodes. This thesis presents the design and implementation of a nesC [8] library that achieves secure and robust location determination using these techniques and provides a simple interface that can be used by high-level applications. A working system was built using Cricket sensors to evaluate the feasibility of the techniques along with basic security mechanisms. We measure the tradeoffs between the time required for computation, memory consumption, and the accuracy of the estimated location. We also measure the accuracy of the estimated location under various degrees of attack for both 2-dimensional and 3-dimensional scenarios. Our experimental results show that in a 2-dimensional system, even with 2 malicious Beacon Nodes out of 8, the maximum increase in error is less than 8 cm for all three techniques, compared with a maximum error of 2 cm without any malicious Beacon Nodes. In the case of a 3-dimensional system with 1 malicious Beacon Node out of 8, the maximum increase is less than 20 cm, compared with a maximum error of about 10 cm when no malicious Beacon Nodes are present.
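For a sense of how an attack-tolerant estimator differs from plain MMSE, the sketch below compares a mean-squared-residual estimate with a least-median-of-squares (LMS) estimate on synthetic 2-D beacon data with one maliciously inflated range; it illustrates the principle only, not the nesC library's algorithms or the Cricket hardware setup.

```python
# Sketch: estimate a node's 2-D position from beacon range measurements.
# MMSE minimizes the mean squared residual; LMS minimizes the median squared
# residual, which tolerates a minority of maliciously distorted measurements.
import numpy as np

beacons = np.array([[0, 0], [10, 0], [0, 10], [10, 10],
                    [5, 0], [0, 5], [10, 5], [5, 10]], dtype=float)
true_pos = np.array([3.0, 4.0])
dists = np.linalg.norm(beacons - true_pos, axis=1)
dists[0] += 6.0                                # one malicious beacon inflates its range

def sq_residuals(p):
    return (np.linalg.norm(beacons - p, axis=1) - dists) ** 2

xs = np.linspace(0, 10, 101)                   # brute-force grid of candidate positions
grid = np.array([[x, y] for x in xs for y in xs])
mmse = min(grid, key=lambda p: sq_residuals(p).mean())
lms = min(grid, key=lambda p: np.median(sq_residuals(p)))

print("MMSE estimate:", mmse, "error:", np.linalg.norm(mmse - true_pos))
print("LMS  estimate:", lms, "error:", np.linalg.norm(lms - true_pos))
```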
365. P-Coffee: a new divide-and-conquer method for multiple sequence alignment (Choi, Kwangbom, 19 January 2005)
We describe a new divide-and-conquer method, P-Coffee, for the alignment of multiple sequences. P-Coffee first identifies candidate alignment columns using a position-specific substitution matrix (the T-Coffee extended library), tests those columns, and accepts only qualified ones. Accepted columns not only constitute the final alignment solution but also divide the given sequence set into partitions. The same procedure is applied recursively to each partition until all the alignment columns have been collected. In P-Coffee, we minimize sources of bias by aligning all the sequences simultaneously, without requiring a heuristic objective function to optimize, a phylogenetic tree, or a gap-cost scheme. In this research, we show the performance of our approach by comparing our results with those of T-Coffee on the 144 test sets provided in BAliBASE v1.0. P-Coffee outperformed T-Coffee in accuracy, especially on the more complicated test sets.
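A structural skeleton of the divide-and-conquer recursion described above: an accepted column (one position per sequence) splits every sequence into a left part and a right part, which are then handled independently. The column finder below is only a placeholder; P-Coffee scores and tests candidate columns with the T-Coffee extended library.

```python
# Skeleton of the recursion: an accepted alignment column is one position per
# sequence; it splits every sequence into a left and a right part, and the two
# parts are aligned recursively and independently.
def find_anchor_column(seqs):
    # Placeholder column finder: anchor each sequence at its midpoint.
    # The real method scores candidate columns with the extended library and
    # accepts only columns that pass a consistency test.
    if min(len(s) for s in seqs) < 2:
        return None
    return [len(s) // 2 for s in seqs]

def align(seqs):
    col = find_anchor_column(seqs)
    if col is None:
        return [seqs]                          # base case: align directly
    left = [s[:i] for s, i in zip(seqs, col)]
    right = [s[i + 1:] for s, i in zip(seqs, col)]
    anchor = [[s[i] for s, i in zip(seqs, col)]]
    return align(left) + [anchor] + align(right)

blocks = align(["ACGTGA", "ACTTGA", "AGGTCA"])
print(blocks)                                  # partitioned blocks separated by anchor columns
```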
366. Reconstruction of Ground Penetrating Radar Images using techniques based on Optimization (Suvarna, Sushil Sheena, 29 January 2004)
Ground Penetrating Radar (GPR) is an instrument used in semi-automated construction systems. In principle, subsurface objects such as pipes and mines may be detected and potentially measured from its images. The detection of utilities is complicated by a combination of the complexity of the GPR data collection technique and the irregularities present beneath the surface. This thesis provides initial results in the development of an algorithm to invert the effects of these corruptions and return images that are exact in the placement and conformation of subterranean objects. The technique employed is a deconvolution-like method that uses maximum a posteriori (MAP) optimization to estimate the best reconstruction; the optimizer is mean field annealing (MFA) with gradient descent. Using this technique, single objects in the field of observation were reconstructed to within an acceptable percentage of their original shape. Further work would involve reconstructing multiple objects in the field of observation, as well as considering features other than hyperbolae that correspond to objects.
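A toy 1-D sketch of MAP-style reconstruction by gradient descent with an annealing-like schedule on the prior weight, in the spirit of the approach described above; the blur kernel, noise level, and smoothness prior are simple stand-ins for the GPR forward model and priors used in the thesis.

```python
# Toy 1-D sketch: recover x from y = blur(x) + noise by gradient descent on a
# MAP-style objective  ||blur(x) - y||^2 + T * lam * smoothness(x),
# where the weight T on the prior is lowered on an annealing-like schedule.
import numpy as np

def blur(v, kernel):
    return np.convolve(v, kernel, mode="same")

rng = np.random.default_rng(1)
kernel = np.array([0.25, 0.5, 0.25])
true_x = np.zeros(50)
true_x[20:25] = 1.0                                       # a buried "object"
y = blur(true_x, kernel) + 0.02 * rng.normal(size=50)

x = np.zeros_like(y)
step, lam, T = 0.2, 0.1, 1.0
for _ in range(300):
    data_grad = blur(blur(x, kernel) - y, kernel[::-1])   # gradient of the data term
    smooth_grad = -np.gradient(np.gradient(x))            # gradient (up to scale) of the smoothness prior
    x -= step * (data_grad + T * lam * smooth_grad)
    T *= 0.99                                             # annealing-like schedule
print("reconstruction error:", np.linalg.norm(x - true_x))
```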
367. Argument Generation for a Biomedical Domain (Navoraphan, Kanyamas, 30 January 2008)
Discourse generation is a critical task in natural language generation. In this thesis, we introduce an approach to discourse generation for qualitative causal probabilistic domains that incorporates argumentation into the generation process. The discourse generation process uses three modules: a qualitative causal probabilistic domain model, a genre-specific discourse grammar, and a normative argument generator. The model of discourse generation has been implemented for the domain of clinical genetics. In conjunction with GenIE, a prototype intelligent system for generating the first draft of a patient letter on behalf of a genetic counselor, the discourse grammar exploits general information about clinical genetics, as well as documentation about a specific patient's case provided by a genetic counselor, to create discourse plans. The argument generator generates arguments for the claims passed to it from the discourse grammar using domain-independent argument strategies. An important contribution of the thesis is a modification of the argument generator to support interactive argument exploration.
368. Legal Requirements Acquisition for the Specification of Legally Compliant Information Systems (Breaux, Travis, 22 April 2009)
U.S. federal and state regulations impose mandatory and discretionary requirements on industry-wide business practices to achieve non-functional, societal goals such as improved accessibility, privacy, and safety. The structure and syntax of regulations affect how well software engineers identify and interpret legal requirements, and inconsistent interpretations can lead to noncompliance and violations of the law. To support software engineers who must comply with these regulations, I propose a Frame-Based Requirements Analysis Method (FBRAM) to acquire and specify legal requirements from U.S. federal regulatory documents. The legal requirements are systematically specified using a reusable, domain-independent upper ontology, natural language phrase heuristics, a regulatory document model, and a frame-based markup language. The methodology maintains traceability from regulatory statements and phrases to formal properties in a frame-based model and supports the resolution of multiple types of legal ambiguity. The methodology is supported by a software prototype that assists engineers with applying the model and analyzing legal requirements. This work is validated in three domains, information privacy, information accessibility, and aviation safety, which are governed by the Health Insurance Portability and Accountability Act of 1996, the Rehabilitation Act Amendments of 1998, and the Federal Aviation Act of 1958, respectively.
369. Hamilton Cycle Heuristics in Hard Graphs (Shields, Ian Beaumont, 23 March 2004)
In this thesis, we use computer methods to investigate Hamilton cycles and paths in several families of graphs where general results are incomplete, including Kneser graphs, cubic Cayley graphs and the middle two levels graph. We describe a novel heuristic which has proven useful in finding Hamilton cycles in these families and compare its performance to that of other algorithms and heuristics. We describe methods for handling very large graphs on personal computers. We also explore issues in reducing the possible number of generating sets for cubic Cayley graphs generated by three involutions.
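One standard ingredient of such heuristics is rotation-based extension (Pósa-style rotations), sketched below on a small example graph; the thesis's heuristic and the hard graph families it targets are considerably more involved than this illustration.

```python
# Sketch of a rotation-extension heuristic for Hamilton paths: greedily extend
# a path; when the endpoint has no unvisited neighbours, use an edge from the
# endpoint back into the path to "rotate" the tail and expose a new endpoint.
import random

def hamilton_path(adj, start, tries=1000):
    random.seed(0)
    path, in_path = [start], {start}
    for _ in range(tries):
        if len(path) == len(adj):
            return path                              # every vertex visited
        end = path[-1]
        fresh = [v for v in adj[end] if v not in in_path]
        if fresh:
            v = random.choice(fresh)
            path.append(v)
            in_path.add(v)
        else:
            # rotation: for an edge (end, path[i]), reversing path[i+1:] yields
            # another path on the same vertices whose new endpoint is path[i+1]
            candidates = [v for v in adj[end] if len(path) < 2 or v != path[-2]]
            if not candidates:
                return None                          # dead end, give up
            i = path.index(random.choice(candidates))
            path[i + 1:] = reversed(path[i + 1:])
    return None

adj = {0: [1, 4], 1: [0, 2, 4], 2: [1, 3], 3: [2, 4], 4: [0, 1, 3]}
print(hamilton_path(adj, 0))                         # e.g. [0, 1, 2, 3, 4]
```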
370. Improving Query Performance using Materialized XML Views: A Learning-based approach (Shah, Ashish Narendra, 20 March 2004)
This thesis presents a novel approach to improving the efficiency of query processing for frequent and important queries on an XML interface to a relational database. The research is motivated by the need to eliminate the processing overhead of converting relational data to XML, by materializing beforehand the answers to frequent and important queries (which we predefine as a query workload) as XML structures. The main contribution of this thesis is to show that selective materialization of data as XML views reduces query-execution costs for the workload queries in relatively static databases. Our learning-based approach precomputes and stores (materializes) parts of the answers to the workload queries as clustered XML views. In addition, the data in the materialized XML clusters are periodically and incrementally refreshed and rearranged to respond to changes in the query workload. We use a collection of music data as a sample database to build our learning-based system. Our experiments show that the approach can significantly reduce processing costs for frequent and important queries on relational databases with XML interfaces.
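A simplified sketch of the workload-driven materialization decision at the heart of such an approach: count how often each query occurs, materialize the answers to the most frequent ones, and serve those from the view cache. The query strings, cache capacity, and the SQL-to-XML conversion below are hypothetical stand-ins, not the system's actual components.

```python
# Sketch: track how often each query appears in the workload, materialize the
# most frequent ones as cached (XML) views, and periodically refresh the cache
# as the workload shifts.
from collections import Counter

class ViewCache:
    def __init__(self, capacity=2):
        self.capacity = capacity
        self.counts = Counter()
        self.views = {}                  # query -> materialized answer

    def answer(self, query, run_on_dbms):
        self.counts[query] += 1
        if query in self.views:
            return self.views[query]     # served from the materialized view
        return run_on_dbms(query)        # fall back to the relational DBMS

    def refresh(self, run_on_dbms):
        top = [q for q, _ in self.counts.most_common(self.capacity)]
        self.views = {q: run_on_dbms(q) for q in top}   # (re)materialize

run_on_dbms = lambda q: f"<result of {q}/>"             # stand-in for SQL-to-XML conversion
cache = ViewCache()
for q in ["albums by artist", "tracks in album", "albums by artist"]:
    cache.answer(q, run_on_dbms)
cache.refresh(run_on_dbms)
print(cache.views)
```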