Making substitution matrices metric

Anfinsen, Jarle, January 2005
With the emergence and growth of large databases of information, efficient methods for storage and processing are becoming increasingly important. The existence of a metric distance measure between data entities enables efficient index structures to be applied when storing the data. Unfortunately, such a measure often does not exist. Amino acid substitution matrices, which are used to estimate similarities between proteins, do not yield metric distance measures, so finding efficient methods for converting a non-metric matrix into a metric one is highly desirable. In this work, the problem is approached by embedding the data contained in the non-metric matrix into a metric space. The embedding is optimized according to a quality measure that takes the original data into account, and a distance matrix is then derived using the metric distance function of the space. More specifically, an evolutionary scheme is proposed for constructing such an embedding: the work shows how a coevolutionary algorithm can be used to find a spatial embedding and a metric distance function that try to preserve as much of the proximity structure of the non-metric matrix as possible. The evolutionary scheme is compared to three existing embedding algorithms, and some modifications to the existing algorithms are proposed with the purpose of handling the data in the non-metric matrix more efficiently. At a higher level, the strategy of deriving a metric distance function from a spatial embedding is compared to an existing algorithm which enforces metricity by manipulating the data in the non-metric matrix directly (the triangle fixing algorithm). The methods presented and compared are general in the sense that they can be applied whenever a non-metric matrix must be converted into a metric one, regardless of how the data in the matrix was originally derived. The proposed methods are tested empirically on amino acid substitution matrices, and the derived metric matrices are used to search for similarity in a database of proteins. The results show that the embedding approach outperforms the triangle fixing approach when applied to matrices from the PAM family, and that the evolutionary embedding algorithms perform best among the embedding algorithms. In the case of the PAM250 scoring matrix, a metric distance matrix is found which is more sensitive than the mPAM250 matrix presented in a recent paper. Possible advantages of choosing one method over another remain unclear for matrices from the BLOSUM family.
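
As a concrete illustration of the metricity problem (this is not the thesis's coevolutionary method), the sketch below checks a small, invented distance matrix for triangle-inequality violations and repairs it with a shortest-path metric closure, a simple baseline in the same spirit as, though not identical to, the triangle fixing algorithm mentioned above.

```python
import numpy as np

def triangle_violations(d):
    """Count triples (i, j, k) with d[i, k] > d[i, j] + d[j, k]."""
    n = d.shape[0]
    count = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if d[i, k] > d[i, j] + d[j, k] + 1e-12:
                    count += 1
    return count

def metric_closure(d):
    """Enforce the triangle inequality by replacing every entry with the
    all-pairs shortest-path distance (Floyd-Warshall). Assumes d is
    symmetric and non-negative with a zero diagonal."""
    d = d.copy()
    for k in range(d.shape[0]):
        d = np.minimum(d, d[:, k:k + 1] + d[k:k + 1, :])
    return d

# Toy 3x3 "distance" matrix: d[0, 2] = 10 exceeds d[0, 1] + d[1, 2] = 5.
d = np.array([[ 0.0, 2.0, 10.0],
              [ 2.0, 0.0,  3.0],
              [10.0, 3.0,  0.0]])
print(triangle_violations(d))                  # 2 (the violation and its mirror)
print(triangle_violations(metric_closure(d)))  # 0
```

Note that the closure only ever shrinks distances, which gives some intuition for why direct manipulation can flatten the original proximity structure that an optimized embedding tries to preserve.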

MOWAHS - Optimised support for XML in mobile environments

Walla, Anders Kristian Harang, January 2005
This report describes a prototype middleware system for optimising the transfer and processing times of XML-based data between mobile, heterogeneous clients, supporting servers and context providers. The system achieves this by compressing or compacting the XML data in different ways and by using different parsing techniques. Two such techniques are examined more thoroughly, namely tag redundancy reduction and binary compression. These optimisation techniques are implemented in a fully functioning XML data optimising system, and their effectiveness is tested and compared. A long-term goal is discussed in relation to these techniques: to develop a set of heuristic rules that would allow the system to determine dynamically, based on available context data, which optimisation methods are most efficient at any given time. The prototype system is developed in Java, with a client for mobile devices written in Java 2 Micro Edition (J2ME).
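
To make the two techniques concrete, here is a minimal sketch (in Python for brevity; the actual prototype is written in Java) combining a toy version of tag redundancy reduction, which shortens repeated tag names under a restorable header mapping, with gzip standing in for binary compression. The sample document and tag names are invented.

```python
import gzip
import re

def compact_tags(xml_text):
    """Toy tag redundancy reduction: replace each distinct tag name with a
    short token, prepending the mapping as a header line so the receiving
    end can restore the original names."""
    names = sorted(set(re.findall(r"</?([\w:-]+)", xml_text)))
    mapping = {name: f"t{i}" for i, name in enumerate(names)}
    for name, short in mapping.items():
        xml_text = re.sub(rf"(</?){re.escape(name)}([ >/])",
                          rf"\g<1>{short}\g<2>", xml_text)
    header = ";".join(f"{short}={name}" for name, short in mapping.items())
    return header + "\n" + xml_text

doc = ("<contacts>"
       + "<contact><name>Anna</name><phone>12345678</phone></contact>" * 50
       + "</contacts>")
compacted = compact_tags(doc)
print(len(doc), len(compacted))                 # effect of tag reduction alone
print(len(gzip.compress(doc.encode())),         # binary compression alone
      len(gzip.compress(compacted.encode())))   # both techniques combined
```

On highly tag-redundant documents the header cost is amortised quickly; which combination wins depends on the document and the device, which is exactly why the heuristic rules discussed above are of interest.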

Towards improving an organization's ability to procure software intensive systems: A survey

Engene, Knut Steinar, January 2005
This report presents a three-step investigation conducted to identify the problems and challenges experienced by small and medium-sized organizations procuring software intensive systems. Archival research was carried out to see whether the available procurement guidelines are applicable to small and medium-sized organizations. Data was collected through questionnaires and interviews with the organizations’ employees responsible for software procurements. The quantitative data was analyzed using statistical methods in an attempt to identify the main weaknesses in current procurement procedures, and the qualitative data was analyzed to complement those findings. The results indicate that the organizations that participated in the survey seldom follow a predefined procedure when they execute software procurements; however, organizations that do have a defined, formalized procurement procedure are significantly more satisfied with their procurements. In addition, risk management is seldom integrated into software procurements, despite the fact that the organizations to some extent consider software procurement a risky activity. Recommendations derived from the survey results are offered to increase an organization's ability to procure and use software intensive systems.

An improved web-based solution for specifying transaction models for CAGISTrans

Bjørge, Thomas Eugen, January 2005
Transactions have been used for several decades to handle concurrent access to data in databases. These transactions adhere to a strict set of transactional rules that ensure that the correctness of the database is maintained. Transactions are also useful in other settings, such as supporting cooperative work over computer networks like the Internet; however, the original transaction model is too strict for this. To enable cooperation between transactions on shared objects, a framework for specifying and executing transaction models adapted to the environment in which they run has been developed, together with a web-based user interface for specifying transaction models for the framework. In this thesis we look at how the process of specifying transaction models for the framework can be improved. More specifically, we start by carefully reviewing the current web-based solution, focusing on its usability, design and technical aspects. We then take a thorough look at Web Services in the context of the transaction model framework; our main objective at this stage is to evaluate the possibility of implementing a new solution for specifying transaction models using Web Services. The last part of our work is the actual implementation of an improved application for specifying transaction models, based on the results of these two evaluations. Our review of the current solution identified several issues. The main problem is that it is difficult for the user to get a good overview of the transaction model she is creating during the specification process, owing to the lack of a visual representation of the model. The specification process is also very tedious, containing a large number of steps that we believe can be reduced. The technical aspects of the solution also leave considerable room for improvement: the overall design can easily be improved, and using different technologies would make the application less error-prone as well as easier to maintain and update. We also concluded that Web Services is not an ideal technology for a transaction model specification application: the client needs a complete overview of the specification process, which leads to extensive duplication of data between the client and the web service, and ultimately to a very complex web service that does not improve the specification process. Based on these results, we implemented a web-based solution for specifying transaction models. Our solution is similar to the original one, but with a strong focus on remedying its shortcomings, both in usability and in technology. This meant giving the user a good overview of the transaction model during the specification process and reducing the number of steps in the process. Additionally, we put considerable effort into building the solution on technological best practices, making it less error-prone than the original and easier to maintain and update.

EventSeer: Testing Different Approaches to Topical Crawling for Call for Paper Announcements

Brennhaug, Knut Eivind, January 2005
The goal of the Eventseer project is to build a digital library of call for paper announcements. Today, call for paper announcements are collected from various mailing lists; the goal of this work is to develop topical crawlers so that Eventseer can also collect announcements from the Web.
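
To illustrate what a topical crawler does, the following best-first sketch prioritizes the crawl frontier by a crude topical score and only follows links from pages that look like call for paper announcements. The `fetch` and `extract_links` callables, the term set and the thresholds are placeholders rather than anything from the thesis.

```python
import heapq

CFP_TERMS = {"call", "papers", "deadline", "submission", "workshop", "conference"}

def topical_score(text):
    """Fraction of topic terms present in the page: a crude relevance score."""
    words = set(text.lower().split())
    return len(CFP_TERMS & words) / len(CFP_TERMS)

def crawl(seed_urls, fetch, extract_links, budget=100, threshold=0.3):
    """Best-first topical crawl: the frontier is ordered by the score of
    the page each link came from, and links are followed only from pages
    that score above the topical threshold."""
    frontier = [(-1.0, url) for url in seed_urls]  # negated scores: max-heap
    heapq.heapify(frontier)
    seen = set(seed_urls)
    hits = []
    while frontier and budget > 0:
        _, url = heapq.heappop(frontier)
        text = fetch(url)          # placeholder: e.g. an HTTP GET
        budget -= 1
        score = topical_score(text)
        if score >= threshold:
            hits.append(url)
            for link in extract_links(text):   # placeholder link extractor
                if link not in seen:
                    seen.add(link)
                    heapq.heappush(frontier, (-score, link))
    return hits
```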

Identification of biomedical entities from Medline abstracts using a dictionary-based approach

Skuland, Magnus, January 2005
The aim of this work was to develop a system for identifying biomedical entities, such as protein and gene names, in a corpus of Medline abstracts, and to extract the most relevant of the identified terms and present them to an end user. The developed prototype, named iMasterThesis, uses a dictionary-based approach. A dictionary consisting of 21K gene names and 425K protein names was constructed automatically. By realizing the protein name dictionary as a multi-level tree structure of hash tables, the approach facilitates a more flexible and relaxed matching scheme than previous approaches. The system was evaluated against a gold standard of 101 expert-annotated Medline abstracts; it identifies protein and gene names in these abstracts with 10% recall and 14% precision. Further improvement of these results appears to require increasing the quality of the dictionary, possibly through manual inspection by domain experts. A graphical user interface presenting an end user with the most relevant identified terms has been developed as well.
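
A minimal sketch of the dictionary idea, nested hash tables keyed on successive tokens so that multi-word names are matched one lookup per token, might look as follows; the names and text are invented, and the thesis's actual matching is more relaxed than this exact-token version.

```python
import re

END = object()  # sentinel marking the end of a complete name

def build_dictionary(names):
    """Nest hash tables keyed on successive tokens; a full name is marked
    by the END sentinel at its final node."""
    root = {}
    for name in names:
        node = root
        for token in name.lower().split():
            node = node.setdefault(token, {})
        node[END] = name
    return root

def find_entities(text, root):
    """Scan the text, following token paths through the nested tables."""
    tokens = re.findall(r"[\w-]+", text.lower())
    hits = []
    for i in range(len(tokens)):
        node = root
        j = i
        while j < len(tokens) and tokens[j] in node:
            node = node[tokens[j]]
            j += 1
            if END in node:
                hits.append(node[END])
    return hits

proteins = ["tumor necrosis factor", "p53", "insulin receptor"]
dictionary = build_dictionary(proteins)
print(find_entities("Binding of tumor necrosis factor to the insulin receptor",
                    dictionary))
# ['tumor necrosis factor', 'insulin receptor']
```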

A study of online construction of fragment replicas

Torres Pizzorno, Fernanda, January 2005
High availability in database systems is achieved using data replication and online repair. In a system containing two replicas of each fragment, the loss of a fragment replica due to a node crash makes the system more vulnerable: only one replica of the fragments contained in the crashed node will be available until a new replica is generated. In this study we have investigated different methods of regenerating a new fragment replica that is up to date with the transactions executed while it was being regenerated. The objective is to determine which method performs best, in terms of completion time at each of the nodes involved, under different conditions. We have investigated three methods for sending the data from the node containing the primary fragment replica to the node being repaired, and one method for catching up with the transactions executed at the primary node during the repair process. These methods assume that the database system uses B-trees as its access method. The methods differ in the volume of data sent over the network and in the work (and time) needed to prepare the data prior to sending. They consist, respectively, of sending the entire B-tree, sending only the leaves of the B-tree, and sending only the data; the latter has two alternatives at the node being repaired, depending on whether the data is inserted into a new B-tree or the B-tree is regenerated from the leaf level and up. This study shows that the choice of recovery method should take into account the network configuration in use. For common network configurations of 100 Mbit/s or lower, it pays to use methods that minimize the volume of data transferred; for higher network bandwidths, it is more important to minimize the amount of work done at the nodes.
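
The "regenerated from the leaf level and up" alternative is essentially bottom-up B-tree bulk loading. A simplified sketch (dict-based nodes, with each parent keyed on its children's smallest keys; production B-trees differ in detail) that rebuilds the internal levels from a list of sorted leaves:

```python
def rebuild_from_leaves(leaves, fanout=4):
    """Rebuild the internal levels of a B-tree bottom-up from sorted leaf
    nodes: each pass groups `fanout` children under a parent whose keys
    are the smallest keys of its children."""
    level = leaves
    while len(level) > 1:
        level = [{"keys": [c["keys"][0] for c in level[i:i + fanout]],
                  "children": level[i:i + fanout]}
                 for i in range(0, len(level), fanout)]
    return level[0]  # the new root

# Leaves as they might arrive from a scan of the primary replica, sorted.
leaves = [{"keys": list(range(k, k + 10)), "children": None}
          for k in range(0, 100, 10)]
root = rebuild_from_leaves(leaves)
print(root["keys"])  # [0, 40, 80]: separator keys for the root's subtrees
```

Because no keys are inserted one at a time, this avoids the repeated node splits of the insert-into-a-new-B-tree alternative, at the cost of requiring the leaves to arrive in sorted order.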

Development of a Semantic Web Solution for Directory Services

Buil Aranda, Carlos, January 2005
The motivation for this work is a common problem in organizations: accessing and managing the growing amount of stored data. Companies can take advantage of emerging Semantic Web technology to address this problem. Invenio AS needs to access a directory service in an efficient way, and Semantic Web languages can be used to achieve this. In this thesis, a literature study has been carried out, investigating the main ontology languages proposed by the World Wide Web Consortium, RDF(S) and OWL (with its Web service extension OWL-S), and the ontology language proposed by the International Organization for Standardization, Topic Maps. The literature study can serve as an introduction to these ontology languages. A model of the databases has been extracted and designed in UML, and this model has been used to create a common ontology merging the two initial databases. The ontology representing the databases has been expressed in the three languages, and the quality and semantic accuracy of each language for the Invenio case have been analysed, with detailed results obtained from this analysis.
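
As a taste of what such an ontology looks like in RDF(S), here is a small sketch using the Python rdflib library; the namespace, class, property and instance names are invented for illustration and are not Invenio's actual schema.

```python
from rdflib import Graph, Literal, Namespace, RDF, RDFS

INV = Namespace("http://example.org/invenio#")  # hypothetical namespace

g = Graph()
g.bind("inv", INV)

# A tiny class hierarchy standing in for the merged database model.
g.add((INV.Document, RDF.type, RDFS.Class))
g.add((INV.Report, RDFS.subClassOf, INV.Document))

# One instance with a literal property.
g.add((INV.report42, RDF.type, INV.Report))
g.add((INV.report42, INV.title, Literal("Quarterly summary")))

print(g.serialize(format="turtle"))
```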

Knowledge Transfer in Open Source Communities of Practice: The Case of Azureus

Evans, Peter John Dalland, January 2005
This paper discusses knowledge sharing dynamics in open source communities of practice, based on an empirical study of an open source project. The paper describes how the online community in the study displayed many characteristics of an ongoing community of practice (Lave and Wenger 1991), as well as the distinct role technology and artefacts played in collaboration within the community. It is shown that while the theory of communities of practice captures many important aspects of learning and knowledge sharing in the project, it neglects the role of artefacts and the ways they can contribute to these dynamics. Concepts of knowledge and knowledge transfer are discussed in order to explain those aspects relevant to the observations made in the study. The purpose of the paper is to offer practical and theoretical contributions to the understanding of distributed knowledge transfer, as well as of the characteristics of open source development.

Feature selection in Medline using text and data mining techniques

Strand, Lars Helge, January 2005
In this thesis we propose a new method for searching Medline abstracts for gene products and producing annotations that associate genes with Gene Ontology (GO) codes. Many solutions already exist, using different techniques, but few are capable of addressing the whole GO hierarchy. We propose a method for exploring the hierarchy by dividing it into subtrees and finding terms that are characteristic of the subtrees involved, using feature selection based on chi-square analysis together with naive Bayes classification to assign the correct GO nodes.
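
A minimal sketch of that pipeline, chi-square feature selection feeding a naive Bayes classifier, could be written with scikit-learn as follows; the toy abstracts, labels and the choice of k are invented for illustration.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Stand-in corpus: abstracts labelled with a GO subtree. Real training
# data would come from annotated Medline abstracts.
abstracts = ["kinase phosphorylates the receptor protein",
             "transcription factor binds the promoter region",
             "membrane transporter moves ions across the bilayer",
             "polymerase transcribes the gene into mRNA"]
labels = ["signaling", "transcription", "transport", "transcription"]

model = make_pipeline(
    CountVectorizer(),
    SelectKBest(chi2, k=10),  # keep the terms most characteristic of a subtree
    MultinomialNB())
model.fit(abstracts, labels)
print(model.predict(["factor binds the promoter"]))  # most likely ['transcription']
```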
