  • About
  • The Global ETD Search service is a free service for researchers to find electronic theses and dissertations. This service is provided by the Networked Digital Library of Theses and Dissertations.
    Our metadata is collected from universities around the world. If you manage a university/consortium/country archive and want to be added, details can be found on the NDLTD website.
1

Design of Various VLSI Sorting Accelerator Architectures

Fu, Chien-jung 31 August 2009
This thesis proposes several VLSI sorter architectures. It first presents a baseline serial sorter built on a central memory module with a single compare-and-swap (C&S) functional unit. A dedicated low-cost address generation circuit controls the order of data accesses and C&S operations so that data sequences of any length can be sorted. By using the bit-permutation technique to create the access orders required by the different C&S steps, the address generator can be built from only two adders and three shifters plus some control circuits, and consumes only about 1K gates. Next, the thesis proposes a two-bank memory architecture that reduces the required memory ports from four to two, so that the sorter memory can be realized with on-chip SRAM blocks. Experimental results show that the overall silicon cost is reduced by more than 56% for a sorter circuit that can sort data sequences of length up to 1024. In addition to the serial sorter, the thesis proposes three parallel sorter architectures: the pipeline sorter, the cascade sorter, and the block sorter. Of these, the pipeline sorter delivers the best throughput, although it can be used only for fixed-length data sequences. The block sorter, on the other hand, is the most flexible design and is suitable for variable-length sequences. It is based on the block-level even-odd merge sort algorithm and significantly outperforms the previous block sorter design by using a more efficient algorithm, architectural pipelining, and a better block C&S (BC&S) unit that efficiently realizes separate pre-sort and merge processes. Implementation results show that, in 0.18 µm technology, the core size of the proposed sorter with a block size of four is about 0.509 mm², and it can sort a 1024-point sequence within 32.84 µs.
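As a rough software analogue of the baseline serial sorter described above, the following sketch generates the compare-and-swap schedule of Batcher's odd-even merge sort and applies it with a single C&S operation at a time on one array standing in for the central memory. It is an illustration only: the function names `odd_even_merge_network` and `serial_sort` are hypothetical, the index arithmetic is the textbook formulation rather than the thesis's bit-permutation address generator, and the sketch assumes a power-of-two length, whereas the thesis supports arbitrary lengths.

```python
def odd_even_merge_network(n):
    """Yield (lo, hi) index pairs of Batcher's odd-even merge sort network.

    Assumption: n is a power of two (the thesis handles any length).
    """
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    p = 1
    while p < n:                  # p: size of the sorted runs being merged
        k = p
        while k >= 1:             # k: distance between compared elements
            for j in range(k % p, n - k, 2 * k):
                for i in range(min(k, n - j - k)):
                    # Only compare elements that belong to the same merge group.
                    if (i + j) // (2 * p) == (i + j + k) // (2 * p):
                        yield (i + j, i + j + k)
            k //= 2
        p *= 2


def serial_sort(data):
    """Sort by feeding the network through one compare-and-swap at a time."""
    mem = list(data)              # stands in for the central memory module
    for lo, hi in odd_even_merge_network(len(mem)):
        if mem[lo] > mem[hi]:     # the single C&S functional unit
            mem[lo], mem[hi] = mem[hi], mem[lo]
    return mem
```

Because the schedule depends only on the length and not on the data, the same index sequence could in principle be produced by a small hardware address generator, which is the role the abstract assigns to its dedicated circuit.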
2

Optimized On-chip Software Pipelining On the Cell BE Processor

Hultén, Rikard January 2010
The special architecture of the Cell BE processor has made scientists revisit the problem of sorting. This paper implements and tests a variant of merge sort in which a number of 2-to-1 mergers are connected in a pipelined tree. Large trees contain many more such mergers than there are processors, so the mergers must be mapped to the processors in some way. Optimized mappings are tested, and the results show that changing the model used for optimization might be beneficial. It is also shown that the small size of the local storage on the co-processors does not limit performance.
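The tree of 2-to-1 mergers can be pictured with a minimal sketch in which each merger is a lazy Python generator, so elements flow from the leaves to the root one at a time. The names `merge2` and `merge_tree` are hypothetical, and the sketch makes no attempt to model the mapping of mergers to Cell BE co-processors or their buffer sizes, which is the actual subject of the paper.

```python
def merge2(left, right):
    """2-to-1 merger: lazily merge two sorted iterables into one sorted stream."""
    done = object()                       # sentinel marking an exhausted input
    left, right = iter(left), iter(right)
    a, b = next(left, done), next(right, done)
    while a is not done and b is not done:
        if b < a:
            yield b
            b = next(right, done)
        else:
            yield a
            a = next(left, done)
    while a is not done:                  # drain whichever input remains
        yield a
        a = next(left, done)
    while b is not done:
        yield b
        b = next(right, done)


def merge_tree(sorted_runs):
    """Connect 2-to-1 mergers into a tree over already-sorted runs."""
    level = [iter(run) for run in sorted_runs]
    if not level:
        return iter(())
    while len(level) > 1:
        nxt = [merge2(level[i], level[i + 1])
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:                # an odd run passes through unchanged
            nxt.append(level[-1])
        level = nxt
    return level[0]
```

For example, `list(merge_tree([[1, 4, 9], [2, 3], [0, 8], [5, 6, 7]]))` builds two leaf mergers and one root merger and pulls the merged sequence through them.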
4

Design of Iterative Cascade Sorter Architecture

Chen, Cheng-Chieh 06 September 2011
This thesis presents a new cascaded iterative VLSI sorting architecture that accelerates the sorting of variable-length sequences. The proposed sorter consists mainly of a central data memory block, a core comparison unit, and a special address generation module. Many fast sorting algorithms can be represented by a network of compare-and-swap (C&S) operations that can be divided into several processing steps. Instead of using parallel C&S functional units to perform the C&S operations of the same sorting step, our comparison unit is composed of cascaded C&S units connected through a data commutator, so that different sorting steps can be processed simultaneously. The advantage of the cascaded architecture is that the number of data memory accesses is reduced by a factor equal to the number of cascaded stages. However, reducing the overhead of the data commutator becomes the most critical design issue. This thesis exploits the order of C&S operations in bitonic sorting to obtain a much simpler and more regular data commutator module than that of the previous cascade design, which was derived from Batcher sorting. As a result, the cascade level of our sorter architecture can exceed 2, and a sample 4-level cascade sorter has been implemented. To generate the address sequence suitable for the proposed cascaded comparison unit, the thesis proposes a low-cost address generator based on the bit-permutation technique. Although a high cascade level can significantly reduce memory accesses, which helps reduce power dissipation, the low hardware utilization for short data sequences and the increasing commutator overhead cannot be neglected. Therefore, to achieve further speed-up, the thesis also adopts another form of parallelism: block-level C&S units that compare a block of data at the same time. The block-level C&S units can be designed based on the traditional Batcher's sorting network. Based on the proposed bitonic cascade and Batcher's block sorting approaches, very fast and low-power sorter hardware can be achieved.
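To make the idea of "processing steps" concrete, the sketch below enumerates the steps of a standard iterative bitonic sort and applies them to an array. It also includes an idealized count of memory passes under the assumption that every group of `levels` consecutive steps can be fused into a single pass through memory; that grouping is a simplification for illustration, not the thesis's commutator design, and the names `bitonic_steps`, `bitonic_sort`, and `idealized_passes` are hypothetical. A power-of-two length is assumed.

```python
import math


def bitonic_steps(n):
    """Return the (k, j) step parameters of bitonic sort for n = 2**m inputs."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    steps = []
    k = 2
    while k <= n:                 # k: size of the bitonic sequences being merged
        j = k // 2
        while j >= 1:             # j: distance between compared elements
            steps.append((k, j))
            j //= 2
        k *= 2
    return steps


def bitonic_sort(data):
    """Apply every step in order; each step touches all elements once."""
    mem = list(data)
    n = len(mem)
    for k, j in bitonic_steps(n):
        for i in range(n):
            partner = i ^ j       # partner index differs from i in one bit
            if partner > i:
                ascending = (i & k) == 0
                if (mem[i] > mem[partner]) == ascending:
                    mem[i], mem[partner] = mem[partner], mem[i]
    return mem


def idealized_passes(n, levels):
    """Memory passes if `levels` consecutive steps are fused (an assumption)."""
    return math.ceil(len(bitonic_steps(n)) / levels)
```

The partner index is obtained by flipping a single address bit, which gives an intuition for why a bit-level address generation scheme fits this kind of network.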
5

Sorteringsalgoritmer för strömmad data : Algoritmer för sortering av spatio-temporal data i JSON-objekt / Sorting algorithms for streaming data : Algorithms for sorting spatio-temporal data in JSON objects

Apelqvist, Joakim January 2020
Data from positioning systems such as GPS is increasingly common but is difficult to handle in traditional data storage systems. Such data consists of spatial and temporal attributes and is in some cases represented in JSON format. JSON objects are sorted with built-in sorting functions, which require the entire JSON object to be deserialized in memory. If the data is streamed, the entire data set must be received before sorting can take place. To avoid this, a developer must devise methods for sorting streamed data while the stream is still in progress. This study identifies three suitable sorting algorithms and compares them on how quickly they sort the streamed data and on their memory usage. A client application and a server application were also compared to see whether sorting on the server produced better results. The conclusions drawn from the experiment's results were that merge sort was fastest but used the most memory, while heap sort was slowest but had the lowest memory usage. The client application's sorting times were slightly faster than the server application's.
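One way to sort while the stream is still arriving is to push each record into a binary heap as soon as it is parsed, so that only the final extraction remains once the stream ends. The sketch below illustrates this under stated assumptions: the input is newline-delimited JSON, each object carries a numeric field named `timestamp` (a hypothetical field name), and the function `sort_stream` is not taken from the thesis, whose implementations are not described in the abstract.

```python
import heapq
import json


def sort_stream(lines, key="timestamp"):
    """Heap-sort JSON records incrementally as they arrive.

    Assumptions: one JSON object per line, each with a comparable `key` field.
    """
    heap = []
    for seq, line in enumerate(lines):
        record = json.loads(line)         # parse one object at a time
        # seq breaks ties so that dicts themselves are never compared
        heapq.heappush(heap, (record[key], seq, record))
    while heap:                           # stream has ended: emit in order
        yield heapq.heappop(heap)[2]
```

All records are still held in memory here; what changes is that the ordering work is spread over the duration of the stream instead of starting only after the last object has been received.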
