Spelling suggestions: "subject:"file organization"" "subject:"pile organization""
51 |
Computational proxies : an object-based infrastructure for computational science /Cushing, Judith Bayard. January 1995 (has links)
Thesis, (Ph. D.)--Oregon Graduate Institute of Science and Technology, 1995.
|
52 |
How do experienced Information Lens users use rules?January 1988 (has links)
Wendy E. Mackay ... [et. al.]. / "October 1988." / Includes bibliographical references (p. 12).
|
53 |
Uma solução de alta disponibilidade para o sistema de arquivos distribuidos do Hadoop / A high availability solution for the Hadoop distributed file systemOriani, André, 1984- 22 August 2018 (has links)
Orientador: Islene Calciolari Garcia / Dissertação (mestrado) - Universidade Estadual de Campinas, Instituto de Computação / Made available in DSpace on 2018-08-22T22:11:10Z (GMT). No. of bitstreams: 1
Oriani_Andre_M.pdf: 3560692 bytes, checksum: 90ac96e4274dea19b7bcaec78aa959f8 (MD5)
Previous issue date: 2013 / Resumo: Projetistas de sistema geralmente optam por sistemas de arquivos baseados em cluster como solução de armazenamento para ambientes de computação de alto desempenho. A razão para isso é que eles provêm dados com confiabilidade, consistência e alta vazão. Porém a maioria desses sistemas de arquivos emprega uma arquitetura centralizada, o que compromete sua disponibilidade. Este trabalho foca especificamente em um exemplar de tais sistemas, o Hadoop Distributed File System (HDFS). O trabalho propõe um hot standby para o nó mestre do HDFS a fim de conferir-lhe alta disponibilidade. O hot standby é implementado por meio da (i) extensão da replicação de estado do mestre realizada por seu checkpoint helper, o Backup Node; e por meio da (ii) introdução de um mecanismo automático de failover. O passo (i) aproveitou-se da técnica de duplicação de mensagens desenvolvida por outra técnica de alta disponibilidade para o HDFS chamada Avatar Nodes. O passo (ii) empregou ZooKeeper, um serviço distribuído de coordenação. Essa estratégia resultou em mudanças de código pequenas, cerca de 0,18% do código original, o que faz a solução ser de fácil estudo e manutenção. Experimentos mostraram que o custo adicional imposto pela replicação não aumentou em mais de 11% o consumo médio de recursos pelos nós do sistema nem diminuiu a vazão de dados comparando-se com a versão original do HDFS. A transição completa para o hot standby pode tomar até 60 segundos quando sob cargas de trabalho dominadas por operações de E/S, mas menos de 0,4 segundos em cenários com predomínio de requisições de metadados. Estes resultados evidenciam que a solução desenvolvida nesse trabalho alcançou seus objetivos de produzir uma solução de alta disponibilidade para o HDFS com baixo custo e capaz de reagir a falhas em um breve espaço de tempo / Abstract: System designers generally adopt cluster-based file systems as the storage solution for high-performance computing environments. That happens because they provide data with reliability, consistency and high throughput. But most of those fie systems employ a centralized architecture which compromises their availability. This work focuses on a specimen of such systems, the Hadoop Distributed File System (HDFS). A hot standby for the master node of HDFS is proposed in order to bring high availability to the system. The hot standby was achieved by (i) extending the master's state replication performed by its checkpointer helper, the Backup Node; and by (ii) introducing an automatic failover mechanism. Step (i) took advantage of the message duplication technique developed by other high availability solution for HDFS named AvatarNodes. Step (ii) employed ZooKeeper, a distributed coordination service. That approach resulted on small code changes, around 0.18% of the original code, which makes the solution easy to understand and to maintain. Experiments showed that the overhead implied by replication did not increase the average resource consumption of system nodes by more than 11% nor did it diminish the data throughput compared to the original version of HDFS. The complete transition for the hot standby can take up to 60 seconds on workloads dominated by I/O operations, but less than 0.4 seconds when there is predominance of metadata requisitions. Those results show that the solution developed on this work achieved the goals of producing a high availability solution for the HDFS with low overhead and short reaction time to failures / Mestrado / Ciência da Computação / Mestre em Ciência da Computação
|
54 |
An image delta compression tool: IDeltaSullivan, Kevin Michael 01 January 2004 (has links)
The purpose of this thesis is to present a modified version of the algorithm used in the open source differencing tool zdelta, entitled "iDelta". This algorithm will manage file data and will be built specifically to difference images in the Photoshop file format.
|
55 |
Concentric Layout, A New Scientific Data Layout For Matrix Data Set In Hadoop File SystemCheng, Lu 01 January 2010 (has links)
The data generated by scientific simulation, sensor, monitor or optical telescope has increased with dramatic speed. In order to analyze the raw data speed and space efficiently, data preprocess operation is needed to achieve better performance in data analysis phase. Current research shows an increasing tread of adopting MapReduce framework for large scale data processing. However, the data access patterns which generally applied to scientific data set are not supported by current MapReduce framework directly. The gap between the requirement from analytics application and the property of MapReduce framework motivates us to provide support for these data access patterns in MapReduce framework. In our work, we studied the data access patterns in matrix files and proposed a new concentric data layout solution to facilitate matrix data access and analysis in MapReduce framework. Concentric data layout is a data layout which maintains the dimensional property in chunk level. Contrary to the continuous data layout which adopted in current Hadoop framework by default, concentric data layout stores the data from the same sub-matrix into one chunk. This matches well with the matrix operations like computation. The concentric data layout preprocesses the data beforehand, and optimizes the afterward run of MapReduce application. The experiments indicate that the concentric data layout improves the overall performance, reduces the execution time by 38% when the file size is 16 GB, also it relieves the data overhead phenomenon and increases the effective data retrieval rate by 32% on average.
|
56 |
Leyline : a provenance-based desktop search system using graphical sketchpad user interfaceGhorashi, Seyed Soroush 07 December 2011 (has links)
While there are powerful keyword search systems that index all kinds of resources
including emails and web pages, people have trouble recalling semantic facts such as
the name, location, edit dates and keywords that uniquely identifies resources in their
personal repositories. Reusing information exasperates this problem. A rarely used
approach is to leverage episodic memory of file provenance. Provenance is
traditionally defined as "the history of ownership of a valued object". In terms of
documents, we consider not only the ownership, but also the operations performed on
the document, especially those that related it to other people, events, or resources. This
thesis investigates the potential advantages of using provenance data in desktop
search, and consists of two manuscripts. First, a numerical analysis using field data
from a longitudinal study shows that provenance information can effectively be used
to identify files and resources in realistic repositories. We introduce the Leyline, the
first provenance-based search system that supports dynamic relations between files
and resources such as copy/paste, save as, file rename. The Leyline allows users to
search by drawing search queries as graphs in a sketchpad. The Leyline overlays
provenance information that may help users identify targets or explore information
flow. A limited controlled experiment showed that this approach is feasible in terms of
time and effort. Second, we explore the design of the Leyline, compare it to previous provenance-based desktop search systems, including their underlying assumptions and
focus, search coverage and flexibility, and features and limitations. / Graduation date: 2012
|
Page generated in 0.0789 seconds