1 |
Adaptive Comparison-Based Algorithms for Evaluating Set Queries. Mirzazadeh, Mehdi. January 2004.
In this thesis we study a problem that arises in answering boolean queries submitted to a search engine. A search engine typically stores the set of IDs of the documents containing each word in pre-computed sorted order, so to evaluate a query like "computer AND science" it must compute the intersection of the sets of documents containing the words "computer" and "science". More complex queries result in more complex set expressions. We consider the problem of evaluating a set expression with union and intersection as operators and ordered sets as operands, and we explore properties of comparison-based algorithms for the problem. A <i>proof of a set expression</i> is the set of comparisons that a comparison-based algorithm must perform before it can determine the result of the expression. We discuss the properties of such proofs and, based on how complex the smallest proofs of a set expression <i>E</i> are, we define a measure of how difficult <i>E</i> is to compute. We then design an algorithm that adapts to the difficulty of the input expression, and we show that its running time is roughly proportional to that difficulty, with a factor roughly logarithmic in the number of operands of the input expression.
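The abstract leaves the algorithm abstract; as a flavour of adaptive, comparison-based set evaluation, here is a minimal sketch of intersecting two sorted ID lists with doubling (galloping) search, where the number of comparisons shrinks on easy inputs in which the sets barely interleave. The function names and data are illustrative, not taken from the thesis.

```python
import bisect

def gallop(a, target, lo):
    """Return the insertion point of target in sorted list a, starting at
    index lo and doubling the step, so the cost is O(log d) comparisons
    where d is the distance advanced (not O(log len(a)))."""
    step, hi = 1, lo
    while hi < len(a) and a[hi] < target:
        lo = hi + 1
        hi += step
        step *= 2
    return bisect.bisect_left(a, target, lo, min(hi, len(a)))

def adaptive_intersection(a, b):
    """Intersect two sorted ID lists, leaping over long runs of one list
    that fall between consecutive elements of the other."""
    result, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            result.append(a[i])
            i, j = i + 1, j + 1
        elif a[i] < b[j]:
            i = gallop(a, b[j], i)   # skip a-elements smaller than b[j]
        else:
            j = gallop(b, a[i], j)   # skip b-elements smaller than a[i]
    return result

# Easy input: the sets barely interleave, so few comparisons are needed.
print(adaptive_intersection([1, 2, 3, 1000], [1000, 2000, 3000]))  # [1000]
```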
|
2 |
Effect of different platforms on coupling compensation matrices in AOA estimation algorithms using small size UCA. Ghazaany, Tahereh S.; Zhu, Shaozhen (Sharon); Jones, Steven M.R.; Abd-Alhameed, Raed; Noras, James M.; Van Buren, T.; Marker, S. January 2014.
In this paper the sensitivity of the decoupling matrix used for mutual coupling compensation in small-size uniform circular arrays is studied. The compensation matrix is calculated using the receiving-mode technique for a 5-element uniform circular array and applied to two groups of direction-finding algorithms: phase-comparison-based (interferometry) and subspace-based algorithms. In the tracking application considered, the receiver array is deployed on a car roof or aircraft, so the geometry of the platform influences the compensation results. In this work, the effect of different ground-plane geometries is investigated through simulation, in terms of the standard deviation of angular error for each estimation algorithm. The results show that the calibration conditions used to determine the compensation matrix affect the AOA estimation accuracy.
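To show where the compensation matrix enters the processing chain, here is a minimal numpy sketch of a 5-element UCA whose snapshot is distorted by a mutual-coupling matrix and then decoupled before a beamformer scan. The coupling matrix, geometry and constants are invented stand-ins, not the receiving-mode calibration of the paper.

```python
import numpy as np

M, radius, wavelength = 5, 0.06, 0.125        # 5-element UCA (assumed values)
phi_m = 2 * np.pi * np.arange(M) / M          # element angular positions

def steering(theta):
    """Ideal UCA steering vector for azimuth theta (radians)."""
    return np.exp(1j * 2 * np.pi * radius / wavelength
                  * np.cos(theta - phi_m))

# Stand-in mutual-coupling matrix (nearest-neighbour leakage); a real one
# would come from the receiving-mode calibration described in the paper.
C = (np.eye(M)
     + 0.25 * np.roll(np.eye(M), 1, axis=1)
     + 0.25 * np.roll(np.eye(M), -1, axis=1))

true_theta = np.deg2rad(40.0)
snapshot = C @ steering(true_theta)           # coupled array output

def estimate(x, compensate):
    """Scan a beamformer over azimuth; optionally undo the coupling first."""
    y = np.linalg.solve(C, x) if compensate else x
    grid = np.deg2rad(np.arange(0.0, 360.0, 0.1))
    powers = [abs(np.vdot(steering(t), y)) for t in grid]
    return np.rad2deg(grid[int(np.argmax(powers))])

# With compensation, solve(C, x) restores the ideal steering vector, so the
# scan recovers the true 40-degree bearing (up to grid resolution).
print("without compensation:", estimate(snapshot, False))
print("with compensation:   ", estimate(snapshot, True))
```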
|
3 |
Shortest Path Queries in Very Large Spatial Databases. Zhang, Ning. January 2001.
Finding shortest paths in a graph has been studied for a long time, and there are many main-memory-based algorithms for the problem. Among these, Dijkstra's shortest path algorithm is one of the most commonly used efficient algorithms for graphs with non-negative edge weights. Even more efficient algorithms have been developed recently for graphs with particular properties, such as edge weights that fall within a bounded integer range. All of the algorithms mentioned require the graph to reside entirely in main memory. However, for very large graphs, such as the digital maps managed by Geographic Information Systems (GIS), this requirement usually cannot be satisfied, so those algorithms are not appropriate. My objective in this thesis is to design and evaluate the performance of external-memory (disk-based) shortest path algorithms and data structures for solving the shortest path problem in very large digital maps. In particular, the following questions are studied: What have other researchers done on shortest path queries in very large digital maps? What could be improved on previous work? How efficient are our new shortest path algorithms on digital maps, and what factors affect the efficiency? What can be done based on the algorithm? In this thesis, we give a disk-based Dijkstra-like algorithm that answers shortest path queries using pre-processed information. Experiments based on our Java implementation show which factors affect the running time of our algorithms.
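For background, here is a minimal in-memory sketch of the Dijkstra-style search the thesis starts from; the disk-based, pre-processing-assisted variant the thesis contributes is not reproduced here.

```python
import heapq

def dijkstra(graph, source):
    """Single-source shortest paths for non-negative edge weights.
    graph: dict mapping node -> list of (neighbor, weight) pairs."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                      # stale queue entry, skip it
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Toy road network: nodes are junctions, weights are travel costs.
roads = {"a": [("b", 4), ("c", 1)], "b": [("d", 1)],
         "c": [("b", 2), ("d", 6)], "d": []}
print(dijkstra(roads, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 4}
```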
|
4 |
Understanding methods for internal and external preference mapping and clustering in sensory analysis. Yenket, Renoo. January 1900.
Doctor of Philosophy. Department of Human Nutrition. Edgar Chambers IV. Preference mapping is a method that provides product development directions, letting developers see the whole picture of products, liking and relevant descriptors in a target market. Many statistical methods and commercial statistical software programs offering preference mapping analyses are available to researchers. Because of the numerous options, this research addresses two questions that most scientists must answer before choosing a method of analysis: 1) do the different methods provide the same interpretation, co-ordinate values and object orientation; and 2) which method and program should be used with the data provided?
This research used data from paint, milk and fragrance studies, representing complexity from lesser to higher. The techniques used are principal component analysis, multidimensional preference map (MDPREF), modified preference map (PREFMAP), canonical variate analysis, generalized procrustes analysis and partial least squares regression, using the statistical software programs SAS, Unscrambler, Senstools and XLSTAT. Moreover, the homogeneity of the consumer data was investigated through hierarchical cluster analysis (McQuitty's similarity analysis, median, single linkage, complete linkage, average linkage and Ward's method), a partitional algorithm (the k-means method, sketched below) and a nonparametric method, compared against four manually clustered groups (strict, strict-liking-only, loose and loose-liking-only segments). The manual clusters were extracted according to the products most frequently rated best liked and least liked on hedonic ratings. Furthermore, the impact of plotting preference maps for individual clusters was explored with and without the use of an overall mean liking vector.
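As an illustration of the partitional clustering step, here is a minimal k-means run on a toy consumer-by-product hedonic matrix, alongside a manual-style segmentation by best-liked product; the data and the choice k = 2 are invented for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy hedonic data: rows are consumers, columns are products,
# entries are 9-point hedonic ratings (invented for illustration).
ratings = np.array([
    [8, 7, 2, 3],   # consumers who like products 1-2
    [9, 8, 1, 2],
    [7, 9, 3, 1],
    [2, 3, 8, 9],   # consumers who like products 3-4
    [1, 2, 9, 8],
    [3, 1, 7, 9],
])

# Partitional clustering: k-means with k chosen a priori.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(ratings)
print(labels)        # e.g. [0 0 0 1 1 1] -- two preference segments

# Manual-style segmentation: assign each consumer by best-liked product.
best_liked = ratings.argmax(axis=1)
print(best_liked)    # [0 0 1 3 2 3]
```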
Results illustrated that the various statistical software programs did not agree in their orientations and co-ordinate values, even when using the same preference mapping method. Also, if the data were not highly homogeneous, interpretations could differ. Most computer cluster analyses did not segment consumers according to their preferences and did not yield clusters as homogeneous as manual clustering did. The interpretation of preference maps created from the most homogeneous clusters improved little when applied to complicated data. Researchers should look at key findings from univariate data in descriptive sensory studies to obtain accurate interpretations and suggestions from the maps, especially for external preference mapping. When researchers make recommendations based on an external map alone for complicated data, preference maps may be overused.
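At its core, an internal preference map in the MDPREF style is a principal component analysis of the consumer-by-product hedonic matrix; here is a minimal sketch with the same toy data as above, all variable names illustrative.

```python
import numpy as np

# Same toy consumer-by-product hedonic matrix as above.
ratings = np.array([
    [8, 7, 2, 3], [9, 8, 1, 2], [7, 9, 3, 1],
    [2, 3, 8, 9], [1, 2, 9, 8], [3, 1, 7, 9],
], dtype=float)

# Center each consumer's ratings, then take the SVD: products land as
# points in the component space and each consumer becomes a preference
# direction pointing toward the products that consumer likes.
X = ratings - ratings.mean(axis=1, keepdims=True)
U, s, Vt = np.linalg.svd(X, full_matrices=False)

product_coords = Vt[:2].T * s[:2]   # product map (first two dimensions)
consumer_dirs = U[:, :2]            # consumer preference directions

print(np.round(product_coords, 2))
print(np.round(consumer_dirs, 2))
```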
|
5 |
Voltage control in distribution networks using on-load tap changer transformers. Gao, Chao. January 2013.
Voltage is one of the most important parameters in electrical power networks. Distribution Network Operators (DNOs) have the responsibility to maintain the voltage supplied to consumers within statutory limits. The On-Load Tap Changer (OLTC) transformer equipped with an Automatic Voltage Control (AVC) relay is the most widely used and effective voltage control device. Owing to the advantages of Distributed Generation (DG), more and more distributed resources are being connected to local distribution networks to relieve network constraints and to reduce the losses between the supply station and consumers. When DG is connected, the direction of power flow can reverse if the DG output exceeds the local load, so power can flow either from the grid towards the loads or vice versa. The connection point of the DG may suffer overvoltage when the DG is producing a large amount of power, and the intermittent nature of the renewable resources most frequently used in DG results in uncertainty in distribution network operation. Conventional OLTC voltage control methods therefore need to change when DG is connected, both to address these challenges and to relax the limits that current DNO operational policies place on DG output.

The thesis presents an analysis of voltage control using OLTC transformers with DG in distribution networks. It reviews conventional OLTC voltage control schemes and the existing policies of DNOs in the UK, and gives an overview of DG technologies and their operating characteristics. The impact of DG on OLTC voltage control schemes is simulated and discussed, including the effects of the different X/R ratios of overhead lines and underground cables; these impacts need to be assessed critically before any new method is implemented. The thesis also introduces the concepts of the Smart Grid and the Smart Meter in the context of the transition from passive to active distribution networks, investigating the role of the Smart Meter and the communication technologies that could support voltage control. An analysis of the latency, cost and availability of an example solution demonstrates that real-time voltage control using Smart Metering over existing communication infrastructure cannot be achieved cost-effectively.

Finally, the thesis proposes an advanced compensation-based OLTC voltage control algorithm using an Automatic Compensation Voltage Control (ACVC) technique, which improves voltage control performance under DG penetration without requiring communication. The proposed algorithm is simulated under varying load and DG conditions in MATLAB Simulink to show its robustness, and a generic UK 11kV network is modelled to verify its control performance as DG capacity increases.
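The abstract does not spell out the relay logic, but a minimal sketch of the conventional AVC scheme being extended, stepping the tap only after the voltage has stayed outside a deadband for a set delay, looks roughly as follows. All constants are illustrative, and the measured voltage is held fixed (open loop) for simplicity.

```python
def avc_step(v_measured, tap, time_outside, dt,
             v_target=1.0, bandwidth=0.015, delay=60.0,
             tap_min=-9, tap_max=9):
    """One control cycle of a conventional AVC relay (voltages in per-unit).

    A tap change is issued only after the voltage has sat outside the
    deadband for `delay` seconds, to ride through short excursions.
    """
    error = v_measured - v_target
    if abs(error) <= bandwidth:
        return tap, 0.0                  # inside the deadband: reset timer
    time_outside += dt
    if time_outside < delay:
        return tap, time_outside         # wait out short excursions
    if error > 0 and tap > tap_min:
        tap -= 1                         # voltage high: tap down
    elif error < 0 and tap < tap_max:
        tap += 1                         # voltage low: tap up
    return tap, 0.0

# Voltage sags to 0.97 p.u.; after 60 s of persistence the relay taps up.
tap, timer = 0, 0.0
for _ in range(70):
    tap, timer = avc_step(0.97, tap, timer, dt=1.0)
print(tap)  # 1
```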
|
6 |
A knowledge-based memetic algorithm proposal for the 3-D protein structure prediction problem. Correa, Leonardo de Lima. January 2017.
Memetic algorithms are evolutionary metaheuristics intrinsically concerned with exploiting and incorporating knowledge about the problem under study. In this dissertation, we present a knowledge-based, multi-population memetic algorithm for the three-dimensional protein structure prediction problem, aimed at modelling structures without the explicit use of templates from experimentally determined structures. The algorithm is structured in two main processing steps: (i) sampling and initialization of solutions; and (ii) optimization of the structural models produced by the previous step. The first step generates and classifies several structural models for a target protein using the Angle Probability List strategy, in order to define distinct structural groups and to create better structures with which to seed the initial multi-population individuals of the memetic algorithm. The Angle Probability List takes advantage of structural knowledge stored in the Protein Data Bank to reduce the complexity of the conformational search space.

The second step optimizes the structures generated in the first stage by applying the proposed memetic algorithm, which organizes the population of individuals in a tree structure where each node can be seen as an independent subpopulation. Over the course of the run, nodes interact through global search operations tailored to the characteristics of the problem, aiming at information sharing, population diversity, and more effective exploration of the problem's multimodal search space. The method also encompasses ad-hoc global search operators, whose objective is to increase the exploration capacity of the method with respect to the characteristics of the protein structure prediction problem, combined with the Artificial Bee Colony algorithm used as a local search technique applied at each node of the tree. The proposed algorithm was tested on a set of 24 amino acid sequences and compared with two reference methods in the protein structure prediction area, Rosetta and QUARK. The results show the ability of the method to predict three-dimensional protein structures whose folds are similar to the experimentally determined structures, in terms of the structural metrics Root-Mean-Square Deviation and Global Distance Test Total Score. Our method reached results comparable to Rosetta and QUARK, and in some cases outperformed them, corroborating the effectiveness of the proposal.
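A schematic of the tree-structured multi-population described above, with a stub perturbation step standing in for the Artificial Bee Colony local search; the fitness function, tree shape and operators are placeholders, not the dissertation's actual components.

```python
import random

def fitness(x):
    """Placeholder energy: the real method scores 3-D protein models."""
    return sum(v * v for v in x)

def local_search(pop):
    """Stub for the Artificial Bee Colony step run inside each node:
    try a small perturbation of each solution, keep it if it improves."""
    out = []
    for x in pop:
        y = [v + random.gauss(0, 0.05) for v in x]
        out.append(y if fitness(y) < fitness(x) else x)
    return out

def make_node(size, dim):
    return [[random.uniform(-1, 1) for _ in range(dim)] for _ in range(size)]

# Ternary tree of subpopulations: node 0 is the root; children of node i
# are nodes 3i+1 .. 3i+3.
nodes = [make_node(size=10, dim=6) for _ in range(13)]

for generation in range(50):
    for i, pop in enumerate(nodes):
        nodes[i] = local_search(pop)              # intra-node refinement
    for i in range(1, len(nodes)):                # global operation: each child
        parent = (i - 1) // 3                     # promotes its best solution
        best_child = min(nodes[i], key=fitness)
        worst = max(range(len(nodes[parent])),
                    key=lambda k: fitness(nodes[parent][k]))
        if fitness(best_child) < fitness(nodes[parent][worst]):
            nodes[parent][worst] = list(best_child)

print(round(fitness(min(nodes[0], key=fitness)), 4))  # best solution at root
```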
|
7 |
A Scalable Partial-Order Data Structure for Distributed-System Observation. Ward, Paul. January 2001.
Distributed-system observation is foundational to understanding and controlling distributed computations. Existing tools for distributed-system observation are constrained in the size of computation they can observe by three fundamental problems: they lack scalable information collection, scalable data structures for storing and querying the information collected, and scalable information-abstraction schemes. This dissertation addresses the second of these problems. Two core problems were identified in providing a scalable data structure. First, in spite of the existence of several distributed-system-observation tools, the requirements of such a structure were not well defined. Rather, current tools appear to be built with events as the core data structure; events are assigned logical timestamps, typically Fidge/Mattern, as needed to capture causality, and algorithms then take advantage of additional properties of these timestamps that are not explicit in the formal semantics. This dissertation defines the data-structure interface precisely, and goes some way toward reworking algorithms in terms of that interface. The second problem is providing an efficient, scalable implementation of the defined data structure. The key issue in solving this is to provide a scalable precedence-test operation. Current tools use the Fidge/Mattern timestamp for this. While this provides a constant-time test, it requires space per event proportional to the number of processes. As the number of processes increases, the space consumption becomes sufficient to affect the precedence-test time because of caching effects; it also becomes problematic when the timestamps need to be copied between processes or written to a file. Worse, existing theory suggested that the space consumption of Fidge/Mattern timestamps was optimal. In this dissertation we present two alternative timestamp algorithms that require substantially less space than the Fidge/Mattern algorithm.
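For reference, here is a minimal sketch of the Fidge/Mattern timestamps in question: each event carries one clock entry per process, the precedence test is a constant-time comparison, and the per-event space cost that motivates the dissertation is visible in the vector length.

```python
class Process:
    """Toy process whose events carry Fidge/Mattern vector timestamps."""

    def __init__(self, pid, n):
        self.pid, self.clock = pid, [0] * n    # one entry per process

    def local_event(self):
        self.clock[self.pid] += 1
        return tuple(self.clock)               # the event's timestamp

    def send(self):
        return self.local_event()              # the message carries the vector

    def receive(self, msg_ts):
        self.clock = [max(a, b) for a, b in zip(self.clock, msg_ts)]
        return self.local_event()

def happened_before(ts_e, ts_f, pid_e):
    """e -> f iff f's view of e's process includes e. The test is O(1),
    but every timestamp costs one entry per process -- the space cost the
    dissertation's alternative timestamps reduce."""
    return ts_e != ts_f and ts_e[pid_e] <= ts_f[pid_e]

p0, p1 = Process(0, 2), Process(1, 2)
e = p0.send()                      # e on process 0
f = p1.receive(e)                  # f on process 1, after delivery of e
h = p0.local_event()               # h on process 0, concurrent with f
print(happened_before(e, f, 0))                            # True
print(happened_before(h, f, 0), happened_before(f, h, 1))  # False False
```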
|