1 |
VIF - uma estrutura de índice invertido em blocos baseada em uma B+-Tree / VIF: a block-organized inverted index structure based on a B+-Tree
MIRANDA, Oscar Gomes de, January 2003
The explosive adoption of the World Wide Web (Web) and its exponential growth are established facts today. The large volume of textual data scattered across the Web has made search systems very popular. Surveys show that about 57% of internet users run a query every day. This demand has driven the popularity of search systems which, despite having evolved significantly in recent years, must keep up to date with structures capable of indexing all this information in order to meet the demand created by the Web's growth.
This dissertation surveys state-of-the-art index structures for Information Retrieval (IR) systems, covering: the inverted file, which is the main focus of this work; the suffix array, which makes proximity queries easier but has a very high storage cost; and the signature file, which was widely used in IR systems in the 1980s but has been superseded by the modern techniques applied to inverted file structures. Among these techniques is index compression using Elias and Golomb codes, which, besides saving space, improve performance both in query processing and in index construction. Efficient methods for accessing, building and manipulating the index are also described in detail.
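To make the compression idea concrete, here is a minimal Python sketch of Elias gamma coding applied to the gaps between consecutive document ids in a posting list. It illustrates the general technique only; the function names are hypothetical, bit strings stand in for packed bits, and nothing here is taken from the dissertation's implementation.

```python
def elias_gamma_encode(n: int) -> str:
    """Return the Elias gamma code of a positive integer as a bit string."""
    if n < 1:
        raise ValueError("Elias gamma codes are defined for integers >= 1")
    binary = bin(n)[2:]  # binary form, always starts with '1'
    # Prefix with (length - 1) zeros so the decoder knows how many bits follow
    return "0" * (len(binary) - 1) + binary


def elias_gamma_decode(bits: str) -> list[int]:
    """Decode a well-formed concatenation of Elias gamma codes."""
    values = []
    i = 0
    while i < len(bits):
        zeros = 0
        while bits[i] == "0":  # count the unary length prefix
            zeros += 1
            i += 1
        values.append(int(bits[i:i + zeros + 1], 2))
        i += zeros + 1
    return values


def compress_postings(doc_ids: list[int]) -> str:
    """Gamma-code the gaps of a strictly increasing list of positive doc ids."""
    bits = []
    previous = 0
    for doc_id in doc_ids:
        bits.append(elias_gamma_encode(doc_id - previous))
        previous = doc_id
    return "".join(bits)


def decompress_postings(bits: str) -> list[int]:
    """Rebuild the doc ids by summing the decoded gaps."""
    doc_ids = []
    running = 0
    for gap in elias_gamma_decode(bits):
        running += gap
        doc_ids.append(running)
    return doc_ids
```

Small gaps, which dominate the posting lists of frequent terms, get short codes: a gap of 1 costs a single bit, while a gap of 12 costs seven.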
As a result of this work, the VIF (Vertical Inverted File) is proposed; it was implemented in practice, drawing on personal experience gained while working on the Radix search engine. The VIF is a block-organized inverted index structure based on a dynamic B+-Tree data structure, which allows small batches of HTML documents to be inserted efficiently and also offers a native form of query-processing optimization through block skipping. Tests of the structure in Radix showed savings of about 78% in space used compared with the previously used structure. Other tests showed an average improvement of 26.5% in query processing time when using block skipping compared with unoptimized processing, measured on the queries most frequently issued by the system's users.
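The block-skipping idea can be sketched roughly as follows, in Python. This is an assumption-laden illustration, not the VIF layout: the class `BlockedPostings`, the block size and the use of a sorted array of per-block maxima (standing in for the B+-Tree descent) are all hypothetical choices.

```python
from bisect import bisect_left


class BlockedPostings:
    """Posting list split into fixed-size blocks, with each block's last doc id kept apart."""

    def __init__(self, doc_ids, block_size=4):
        # doc_ids is assumed to be sorted in ascending order
        self.blocks = [doc_ids[i:i + block_size]
                       for i in range(0, len(doc_ids), block_size)]
        self.last_ids = [block[-1] for block in self.blocks]
        self.blocks_read = 0  # number of blocks actually scanned, for illustration

    def contains(self, doc_id):
        # Binary search over the block summaries plays the role of the tree descent:
        # every block whose last id is smaller than doc_id is skipped unread.
        index = bisect_left(self.last_ids, doc_id)
        if index == len(self.blocks):
            return False
        self.blocks_read += 1
        return doc_id in self.blocks[index]


def intersect(short_list, long_postings):
    """Keep the ids of short_list that also appear in the blocked posting list."""
    return [doc_id for doc_id in short_list if long_postings.contains(doc_id)]


long_postings = BlockedPostings(list(range(0, 100, 3)), block_size=4)
common = intersect([9, 10, 60, 99, 200], long_postings)
```

In this run the long list occupies nine blocks, but only four of them are scanned to answer the conjunctive query; the rest are skipped on the strength of their summaries alone.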
|
2 |
Studying the Properties of a Distributed Decentralized b+ Tree with Weak-Consistency
Ben Hafaiedh, Khaled, 18 January 2012
Distributed computing is very popular in the field of computer science and is widely used in web applications. In such systems, tasks and resources are partitioned among several computers so that the workload can be shared among the different computers in the network, in contrast to systems using a single server computer. Distributed system designs are used for many practical reasons and are often found to be more scalable, robust and suitable for many applications.
The aim of this thesis is to study the properties of a distributed tree data structure that allows searches, insertions and deletions of data elements. In particular, the B-tree structure [13] is considered, which is a generalization of a binary search tree. The study analyzes the effect of distributing such a tree among several computers and investigates the behavior of the structure over a long period of time by growing the network of computers supporting the tree, while the state of the structure is instantly updated as insertion and deletion operations are performed. It also attempts to validate the necessary and sufficient invariants of the B-tree structure that guarantee the correctness of the search operations.
A simulation study is also conducted to verify the validity of this distributed data structure and the performance of the algorithm that implements it. Finally, a discussion at the end of the thesis compares the performance of the system design with other distributed tree structure designs.
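As a point of reference for the invariants mentioned above, the following Python sketch shows search on a classic single-machine B-tree together with a checker for the structural invariants that make that search correct. It is not the distributed algorithm studied in the thesis; the `Node` class and both function names are hypothetical, and node occupancy bounds are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    keys: list
    children: list = field(default_factory=list)  # empty for a leaf


def search(node, key):
    """Return True if key is stored in the subtree rooted at node."""
    i = 0
    while i < len(node.keys) and key > node.keys[i]:
        i += 1
    if i < len(node.keys) and node.keys[i] == key:
        return True
    if not node.children:
        return False
    # Descend into the only child whose key range can contain the key
    return search(node.children[i], key)


def check_invariants(node, low=None, high=None):
    """Return the leaf depth of the subtree; raise AssertionError if an invariant fails."""
    assert node.keys == sorted(node.keys), "keys must be in ascending order"
    for k in node.keys:
        assert low is None or k > low, "key below the range allowed by the parent"
        assert high is None or k < high, "key above the range allowed by the parent"
    if not node.children:
        return 0
    assert len(node.children) == len(node.keys) + 1, "one more child than keys"
    bounds = [low] + node.keys + [high]
    depths = {check_invariants(child, bounds[i], bounds[i + 1])
              for i, child in enumerate(node.children)}
    assert len(depths) == 1, "all leaves must be at the same depth"
    return depths.pop() + 1


root = Node([10, 20], [Node([1, 5]), Node([12, 17]), Node([25, 30])])
```

Search relies on exactly the properties the checker verifies: because every key in child `i` lies strictly between the separators around it, descending into a single child never misses a stored key.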
|
6 |
Attribute-Level Versioning: A Relational Mechanism for Version Storage and Retrieval
Bell, Charles Andrew, 01 January 2005
Data analysts today have at their disposal a seemingly endless supply of data and repositories, and hence of datasets, from which to draw. New datasets become available daily, making the choice of which dataset to use difficult. Furthermore, traditional data analysis has been conducted using structured data repositories such as relational database management systems (RDBMS). These systems, by their nature and design, prohibit duplication in indexed collections, forcing analysts to choose one value for each of the available attributes of an item in the collection. Analysts often discover two or more datasets with information about the same entity. When combining this data and transforming it into a form that is usable in an RDBMS, analysts are forced to deconflict the collisions and choose a single value for each duplicated attribute containing differing values. This deconfliction is the source of a considerable amount of guesswork and speculation on the part of the analyst in the absence of professional intuition. One must consider what is lost by discarding those alternative values. Are there relationships between the conflicting datasets that have meaning? Is each dataset presenting a different and valid view of the entity, or are the alternate values erroneous? If so, which values are erroneous? Is there a historical significance to the variances? The analysis of modern datasets requires specialized algorithms and storage and retrieval mechanisms to identify, deconflict, and assimilate variances of attributes for each entity encountered. These variances, or versions of attribute values, contribute meaning to the evolution and analysis of the entity and its relationship to other entities. A new, distinct storage and retrieval mechanism will enable analysts to efficiently store, analyze, and retrieve the attribute versions without unnecessary complexity or additional alterations of the original or derived dataset schemas.
This paper presents technologies and innovations that assist data analysts in discovering meaning within their data and preserving all of the original data for every entity in the RDBMS.
|
7 |
Hotlinks and Dictionaries
Douïeb, Karim, 29 September 2008
Knowledge has always been a decisive factor in humankind's social evolution. Collecting the world's knowledge is one of the greatest challenges of our civilization. Knowledge involves the use of information, but information is not knowledge: knowledge is a way of acquiring and understanding information. Improving the visibility and accessibility of information requires organizing it efficiently. This thesis focuses on that general goal.
A fundamental objective of computer science is to store and retrieve information efficiently. This is known as the dictionary problem, which asks for a data structure that essentially supports the search operation. In general, information that is important and popular at a given time has to be accessed faster than less relevant information. This can be achieved by dynamically reorganizing the data structure so that relevant information is located closer to the search starting point. The second part of this thesis is devoted to the development and understanding of self-adjusting dictionaries in various models of computation. In particular, we focus on dictionaries that have no knowledge of future accesses. Those dictionaries have to adapt themselves to be competitive with dictionaries specifically tuned for a given access sequence.
This approach, which transforms the information structure, is not always feasible. One reason is that the structure may be based on the semantics of the information, such as a categorization. In this context, the search procedure is linked to the structure itself, and modifying the structure will affect how a search is performed. A solution developed to improve search in a static structure is hotlink assignment, a way to enhance a structure without altering its original design. This approach speeds up the search by creating shortcuts in the structure. The first part of this thesis is devoted to this approach.
|
8 |
Design and Implementation of Query Processing Strategies for Video Data
Yang, Wen-Haur, 09 July 2002
Traditional database systems support only textual and numerical data. Video data stored in these systems can be retrieved only through video identifiers, titles or descriptions. In video data, the frame-by-frame change of objects is one of the most obvious kinds of information. Each video contains temporal and spatial relationships between content objects. The temporal relationships can be specified between frame sequences, and the spatial relationships can be specified between objects in a single frame. The difficulty in designing a content-based video database system lies in storing and describing the relationships between moving objects completely. Many studies on content-based video retrieval have represented the content of a video as a set of frames, but they either left out the temporal ordering of frames in the shot or stored only the relationships between objects in a single frame. From these observations, we conclude that a content-based video database system requires video indexing, query processing and a convenient user interface that fit the requirements and characteristics of videos. In this thesis, we design and implement a query processing strategy for video data. In the proposed strategy, we consider three query types: the exact object match, the spatial-temporal object retrieval and the motion query. An exact object match finds the video files that contain the specified objects, a spatial-temporal object retrieval retrieves the object pairs that satisfy given spatial-temporal relationships, and a motion query finds the set of frames that contain the object movements. Moreover, we consider three design issues: video indexing, video query processing and the video query interface. When a video database holds a large number of videos and each video contains many shots, frames and objects, the processing time for content retrieval is very large.
Thus, we need a proper video indexing strategy to reduce the search time. To capture the spatial-temporal relationships of objects between different frames, we build indexes along both the spatial and temporal axes. In the temporal index file structure, we propose the shot-based B+-tree to index the temporal data. In the spatial index file structure, we use an R-tree to store not only the relationships between objects in one frame, but also the relationships of one object at its first and last appearances in the shot. Based on this strategy, we can describe the status of a moving object in detail. For query processing, we propose a signature file structure to filter out the videos that cannot possibly be the answer. After that, to determine whether the answer exists in the candidate videos, we use a multi-dimensional string, called a binary string, to represent the spatial-temporal relationships between objects. The video query processing problem then becomes a binary string matching problem. Finally, we design and implement a user-friendly interface. Our system runs on a Pentium III machine with a 550 MHz CPU and 256 MB of main memory under Windows 2000 Professional; it uses an Access 2000 database and is written in about 10,000 lines of Delphi 6. From our experience, we show that the proposed system supports efficient query processing, fast searching and a user-friendly interface.
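The filtering role of a signature file can be illustrated with a generic superimposed-coding sketch in Python. This is an assumption, not the thesis's signature layout: the signature width, the number of bits per object, the SHA-256 based hashing and all function names are hypothetical.

```python
import hashlib

SIGNATURE_BITS = 64   # assumed signature width
BITS_PER_OBJECT = 3   # assumed number of bit positions set per object


def object_signature(name: str) -> int:
    """Map an object name to a bit mask with up to BITS_PER_OBJECT bits set."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    signature = 0
    for i in range(BITS_PER_OBJECT):
        signature |= 1 << (digest[i] % SIGNATURE_BITS)
    return signature


def video_signature(objects) -> int:
    """Superimpose (bitwise OR) the signatures of all objects in a video."""
    signature = 0
    for name in objects:
        signature |= object_signature(name)
    return signature


def candidate_videos(signatures: dict, query_objects) -> list:
    """Return ids of videos whose signature covers every bit of the query signature."""
    query = video_signature(query_objects)
    return [vid for vid, sig in signatures.items() if (sig & query) == query]


videos = {"v1": ["car", "tree"], "v2": ["dog"], "v3": ["car", "dog", "ball"]}
signatures = {vid: video_signature(objs) for vid, objs in videos.items()}
candidates = candidate_videos(signatures, ["car", "dog"])
```

A video that really contains all query objects always passes this test, so the filter never discards a true answer. It can let through false positives, which is why a second, exact matching step on the candidates is still needed.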
|
9 |
B+ TREE CACHE MEMORY PERFORMANCE
GIESKE, EDMUND J., 06 October 2004
No description available.
|
10 |
Mokslinio žurnalo tinklalapio kūrimas / Scientific journal’s WEB site development
Ivanauskaitė, Ilma, 18 June 2009
This master's thesis examines the development of a website for a scientific journal. The website under development is compared with the scientific journal websites of other universities and institutes, and also with those of foreign scientific journals. The thesis reviews the search algorithms used in MySQL databases. The B-tree search algorithm is discussed, along with the insertion and removal of elements in a B-tree. The Turbo Boyer-Moore search algorithm is also discussed. The improvements made to the website are presented, such as multilingual pages, a search box and the design of article presentation. Following the comparison and the improvements, the thesis presents conclusions and proposals, such as search by keywords from a list of values, search term suggestions, and the use of CSS files and a content management system. The PHP functions for correcting spelling mistakes are also reviewed.
|