Abstract

  • A hardware sort-merge system which can sort large files rapidly is proposed.
  • It consists of an initial sorter and a pipelined merger.
  • In the initial sorter, record sorting is divided into two parts:
    • key-pointer sorting and record rearranging.
  • The pipelined merger is composed of several intelligent disks
    • each has a simple processor and some buffers.
  • The hardware sort-merge system can sort files of any size by using the pipelined merger repeatedly.
  • The key-pointer sorting circuit in the initial sorter
    • requires only unidirectional connections between neighboring cells,
    • instead of the usual bidirectional ones.
  • The initial sorter can also generate sorted sequences longer than its capacity so that the number of merging passes can be reduced.
  • A new data management scheme is proposed to run all merging passes in a pipelined fashion.

1 introduction

Paragraph 1

  • Sorting is one of the most important operations in data processing systems.
  • Much research has been carried out in this regard [1, 2].
  • However, sorting large files still takes a long time.
  • For example, major banks currently spend two hours or more every night to sort large files (on the order of several megabytes) using large computers, in order to process their demand deposit accounts [3].
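  • Demand deposit accounts
    • are non-interest-bearing checking accounts held by individuals, businesses, and government bodies;
    • depositors may withdraw cash or pay by check at any time,
    • without giving the bank prior notice.
    • Banks allow qualifying customers to write checks for more than their deposit balance, i.e., to overdraw.
  • US banks originally also paid interest on demand deposits,
    • and competed for customers by raising their rates,
    • which drove deposit costs steadily upward
    • and pushed many banks into high-risk lending and investment.
    • To keep destructive price competition from undermining bank safety,
    • the US federal government enacted the Glass-Steagall Act in 1933,
    • whose Regulation Q prohibited paying interest on demand deposits,
    • while setting no statutory limit on the number of checks written or on the minimum balance.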
  • It is estimated that within this decade files to be sorted will become more than ten times larger, and then each sort will take 10-15 hours [3].
  • Thus it is necessary to develop a hardware sorting system which can sort large files more rapidly.

Paragraph 2

  • In view of advances in VLSI technology, various hardware sorters have been proposed [4-13].
  • However, most of them are for internal sorting only, and the size of files which can be sorted is limited by their capacity.
  • To sort large files, external sorting is a necessity.

Paragraph 3

  • In this paper, we propose a hardware sort-merge system, the pipelined sort-merger, which consists of an initial sorter and a pipelined merger.
  • The initial sorter is continuously fed records to be sorted
    • from a secondary memory device and
    • outputs sorted sequences consecutively to the pipelined merger,
    • which in turn merges them into a single output sequence.
  • In the initial sorter, we divide record sorting into key-pointer sorting and record rearranging.
  • The sorting operation itself is completely overlapped by the input/output of the records.
  • Furthermore, it can sort different sequences in a pipelined way.
  • More specifically, while one sorted sequence is being output, a new sequence can be input (and sorted).

Paragraph 4

  • The pipelined merger is composed of several intelligent disks.
  • Each disk has some buffers and a simple processor.
  • All merging passes are run in a pipelined fashion and each pass is supported by a separate intelligent disk.

Paragraph 5

  • The idea of pipelined merging was first proposed by Even [14], who used tapes.
  • Later Todd [9] adapted it to RAM and bubble memories.
  • Here we consider disks since sorting involves only relatively simple operations, and current disk storage systems, such as the IBM 3380 and its controller, the IBM 3880, provide enough intelligence to perform sorting on them directly, freeing the CPU for other processing.

Paragraph 6

  • However, to build the merger using disks involves some difficult problems,
    • such as synchronizing data transmissions
    • and avoiding latency time.
  • To resolve these, we attach m + 1 two-bank buffers to each disk (for an m-way merge) and let each bank size be the track size of the disk to avoid latency time.
  • Finally, a new data management scheme is developed to run all merging passes in a pipelined fashion.

Figure 1

A hardware sort-merge system

Paragraph 7

  • Our key-pointer sorting circuit is simpler than similar sorting circuits
    • in that we require only two unidirectional connections between a cell and its neighbors
    • instead of bidirectional connections.
  • Also, our initial sorter is, to our knowledge, the first one which can generate sorted strings longer than the capacity of the sorter,
    • so that the number of merging passes can be decreased.
    • In fact, it can produce, on the average, sorted sequences of $2n$ records, where n is its capacity.
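
The notes above do not say how the hardware produces runs longer than its capacity. As a point of comparison only, the classic software technique of replacement selection also yields runs averaging about 2n records from a working set of n records on random input. The sketch below shows that technique, not the paper's circuit; the function name `replacement_selection` is illustrative.

```python
import heapq
from itertools import islice

_END = object()  # sentinel marking the end of the input stream


def replacement_selection(records, n):
    """Split `records` into sorted runs using a working set of at most n items.

    This is textbook replacement selection, shown as an analogy; it is an
    assumption that it resembles what the hardware sorter does.
    """
    it = iter(records)
    heap = list(islice(it, n))      # fill the working set
    heapq.heapify(heap)
    runs, current, pending = [], [], []
    while heap:
        smallest = heapq.heappop(heap)
        current.append(smallest)
        nxt = next(it, _END)
        if nxt is not _END:
            if nxt >= smallest:
                heapq.heappush(heap, nxt)   # still fits in the current run
            else:
                pending.append(nxt)         # must wait for the next run
        if not heap:                        # current run is finished
            runs.append(current)
            current = []
            heap, pending = pending, []
            heapq.heapify(heap)
    return runs


print(replacement_selection([5, 3, 2, 9, 1, 7, 8, 6, 4, 0], n=3))
```

With a working set of only three records, the first run already holds six records, which illustrates how runs can exceed the sorter's capacity.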

Paragraph 8

  • When the pipelined merger is composed of k intelligent disks and an m-way merge is performed,
    • it can merge $m^k$ initial sorted sequences at a time.
  • Therefore, the pipelined sort-merger can sort about $2n \times m^k$ records.
    • We also show how to sort a file which contains more records by using the merger repeatedly.
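
The counting argument above can be checked with a small software model: k successive m-way merging passes reduce $m^k$ runs to one, and with runs averaging 2n records the total is about $2n \times m^k$. The sketch below runs the passes one after another rather than in a pipeline, and the names `merge_pass`, `merge_all`, and `approx_capacity` are illustrative, not from the paper.

```python
import heapq


def merge_pass(runs, m):
    """One m-way merging pass: merge each group of m consecutive runs."""
    return [list(heapq.merge(*runs[i:i + m])) for i in range(0, len(runs), m)]


def merge_all(runs, m, k):
    """Apply k merging passes (one per intelligent disk in the paper's design)."""
    for _ in range(k):
        runs = merge_pass(runs, m)
    return runs


def approx_capacity(n, m, k):
    """Approximate record count: average run length 2n times m**k runs."""
    return 2 * n * m ** k


# Nine sorted runs, m = 3, k = 2: 3**2 = 9 runs collapse into a single run.
initial_runs = [[i, i + 9, i + 18] for i in range(9)]
result = merge_all(initial_runs, m=3, k=2)
print(len(result), result[0][:5])
print(approx_capacity(n=1000, m=4, k=3))
```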

Paragraph 9

  • In the next section, after some preliminary remarks, we describe the overall structure of the sort-merger.
  • In Section 3, we discuss the initial sorter in more detail.
  • We also present the modifications needed to generate sorted sequences longer than its capacity.
  • In Section 4, we are concerned with the merger.

2 overall structure

  • For convenience, we call each package of information a record.
  • Each record contains a special field called a key.
  • A set of records forms a file.
  • Sorting means to rearrange the original file so that the records are ordered by their key values.
  • We call a sorted record sequence a run.

Paragraph 2

  • Let the number of records in a file be N (N is very large).
  • Also, let the size of each record and each key be L and i bytes, respectively.
  • For ease of discussion, we first assume that L and i are fixed.
  • Later, we consider the general case.
  • Files to be sorted are stored in secondary memory devices
    • and are transmitted to the hardware sort-merge system serially.

The pipelined sort-merger

  • The pipelined sort-merger consists of the initial sorter and the pipelined merger.
  • Figure 1 is a schematic of the sort-merger.
  • The initial sorting pass is supported by the initial sorter, and each merging pass is supported by a separate intelligent disk.
  • A file to be sorted is transmitted from a secondary memory device to the sort-merger, where it is sorted, and is transmitted back to the secondary memory device.
  • In the sort-merger, data transmission for several passes is done in parallel.
  • Furthermore, input/output and sorting are overlapped, and output starts almost immediately after the sort-merger is filled.
  • Thus, the total sorting time is reduced.
  • (See Figure 2.) Note that not only are all the passes run in an overlapped fashion, but also the initial sorting time is reduced by using our initial sorter.
  • It should be pointed out that the sorter proposed in [13]
    • also does a merge-sort,
    • except that the sorter just takes the place of the CPU and main memory in conventional merge methods;
    • it does a $2^r$-way merge in one merging pass.
  • Since it requires $2^r$ buffers
    • (each at least the track size of a disk to avoid latency time)
    • to do a $2^r$-way merge,
    • $r$ is bounded by the RAM size.
    • Thus the file to be sorted must be transmitted several times between the sorter and the disks.

3 the initial sorter

  • The initial sorter is a hardware internal sorter which is
    continuously fed records to be sorted and outputs sorted
    sequences continuously to the merger.
  • If n is the capacity of the sorter, then each sorted sequence has n records. (Generation of longer sequences is discussed later.)

Paragraph 2

  • In the initial sorter, although whole records are processed
    • they are not moved after each comparison.
    • Instead, we divide record sorting into key-pointer sorting and record
      rearranging.
  • The initial sorter is composed of the key-pointer sorting circuit, a RAM, and a controller.
  • Key-pointer sorting is done in the key-pointer sorting circuit,
    • and record rearranging is done according to the output of the key-pointer sorting circuit and
    • is overlapped with the output of records.
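
The split into key-pointer sorting and record rearranging can be illustrated in software. The sketch below is a minimal model, assuming records live in a list that stands in for the RAM and that a pointer is simply a list index; the helper names `key_pointer_sort` and `rearrange` are illustrative.

```python
def key_pointer_sort(records, key_of):
    """Sort small (key, pointer) pairs instead of moving whole records."""
    pairs = [(key_of(rec), ptr) for ptr, rec in enumerate(records)]
    pairs.sort(key=lambda pair: pair[0])  # compare keys only
    return pairs


def rearrange(records, sorted_pairs):
    """Emit records in sorted order by following the pointers."""
    return [records[ptr] for _, ptr in sorted_pairs]


records = ["0042 alice", "0007 bob", "0019 carol"]
pairs = key_pointer_sort(records, key_of=lambda rec: rec[:4])
print(pairs)
print(rearrange(records, pairs))
```

Each record is read once, when it is output, so no record moves during the comparisons themselves.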

Paragraph 3

  • Since the key-pointer sorting circuit processes only key-pointer pairs, it is small and simple.
  • Furthermore, as shown later, it has a regular linear array structure and is therefore suitable for VLSI implementation.
  • The RAM can be easily built.
  • The controller only needs to perform very simple operations.
  • Thus the whole initial sorter is easily realizable.

Paragraph 4

  • In the remainder of this section, we first discuss the components of the initial sorter, then its timing, and finally the modification needed to generate longer sequences.

the key-pointer sorting circuit

  • The key-pointer sorting circuit is fed a sequence of n key-pointer pairs serially and outputs them serially according to the order of the key values.
  • In the sorting circuit, the sorting time is completely overlapped with the input/output time.
  • It has complete parallel operation and processes key-pointer pairs in a pipelined fashion.
  • Furthermore, it can overlap the sorting time for two consecutive input sequences.

Paragraph 2

  • Its basic algorithm is a pipelined version of the odd-even transposition sort [1, 15].
    • (The odd-even transposition sort is a parallel version of the bubble sort [1].)
  • It is similar to the up-down sorter of Lee et al. [5], the zero-time sorter of Miranker et al. [6], the weave sorter of Mukhopadhyay [7], and the RESST of Carey et al. [8], whose basic algorithms are all pipelined versions of the odd-even transposition sort.
  • In our sorting circuit, unlike the others, all inner connections are unidirectional.
  • It should also be pointed out that the systolic priority queue proposed in [12] cannot sort different sequences in a pipelined fashion as ours can.
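
For reference, the plain (non-pipelined) odd-even transposition sort mentioned above can be sketched as follows. This is the textbook algorithm, simulated sequentially; in hardware, all comparisons of one phase would happen at the same time.

```python
def odd_even_transposition_sort(items):
    """Sort by alternating even and odd phases of neighbor compare-exchanges.

    n phases are enough for n items. The inner loop is sequential here, but
    its comparisons touch disjoint pairs and could run in parallel.
    """
    a = list(items)
    n = len(a)
    for phase in range(n):
        start = phase % 2            # even phase: (0,1),(2,3)...; odd: (1,2),(3,4)...
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a


print(odd_even_transposition_sort([5, 3, 2, 9, 1, 7]))
```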

Paragraph 3

  • The key-pointer sorting circuit consists of a linear array of n/2 cells (we assume that n is even), each of which has two registers and a comparator (Figure 3).
  • Each cell can store two key-pointer pairs and can exchange them according to their key values.
  • There are only two unidirectional connections between a cell and its left and its right neighbor cell.


Paragraph 4

  • Here n key-pointer pairs are serially input to the lower register of the leftmost cell by n right-shift (input) steps
    • and serially output from the upper register of the leftmost cell by n left-shift (output) steps.
  • One right-shift (left-shift) step of the sorting circuit consists of a right-shift (left-shift) phase followed by a compare-exchange phase. (The removed pair
    at the rightmost cell goes out of the array in a right-shift step.)

Paragraph 5


  • Figure 4 shows an example of the sorting of the key sequence “5, 3, 2, 9, 1, 7” (n = 6) in ascending order.
  • (Pointers are not shown.)
  • Initially, each register contains $+\infty$.
  • In the right-shift phase of each input step, key-pointer pairs in lower registers are shifted to the right.
  • In the left-shift phase of each output step, pairs in upper registers are shifted to the left, and $+\infty$ is entered into the upper register of the rightmost cell.
  • In the compare-exchange phase of any step, in each cell, two keys are compared, and the pair with the smaller key value goes to the upper register.
  • At the end of operation, the circuit is filled with $+\infty$'s.
  • Note that at the end of any step, the pair with the smallest key value in the circuit at that time must be in the upper register of the leftmost cell,
    and the second smallest must be in either the lower register of the leftmost cell or the upper register of the second leftmost cell.
  • In general, the pair with the $i$th smallest key value must be in one of the left i cells.
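
The input and output steps described above can be simulated in a few lines. The sketch below models a single sequence entering and leaving at the left end; it leaves out the tags and the right-end input used for pipelining several sequences, and the class name `KeyPointerSorter` and its methods are illustrative, not from the paper.

```python
import math

INF = (math.inf, None)  # sentinel pair: key +infinity, no pointer


class KeyPointerSorter:
    """Linear array of n/2 cells, each with an upper and a lower register."""

    def __init__(self, n):
        if n <= 0 or n % 2:
            raise ValueError("capacity n must be a positive even number")
        self.cells = n // 2
        self.upper = [INF] * self.cells
        self.lower = [INF] * self.cells

    def _compare_exchange(self):
        # In every cell the pair with the smaller key moves to the upper register.
        for j in range(self.cells):
            if self.lower[j][0] < self.upper[j][0]:
                self.upper[j], self.lower[j] = self.lower[j], self.upper[j]

    def input_step(self, pair):
        # Right-shift phase: lower registers shift right and the new pair
        # enters the leftmost lower register; the rightmost lower pair leaves.
        self.lower = [pair] + self.lower[:-1]
        self._compare_exchange()

    def output_step(self):
        # Left-shift phase: the leftmost upper pair leaves, upper registers
        # shift left, and +infinity enters the rightmost upper register.
        out = self.upper[0]
        self.upper = self.upper[1:] + [INF]
        self._compare_exchange()
        return out


def sort_pairs(pairs, n):
    """Feed at most n (key, pointer) pairs in, then read them back out."""
    sorter = KeyPointerSorter(n)
    for pair in pairs:
        sorter.input_step(pair)
    return [sorter.output_step() for _ in pairs]


keys = [5, 3, 2, 9, 1, 7]
print(sort_pairs([(k, ptr) for ptr, k in enumerate(keys)], n=6))
```

In this model the upper registers stay sorted from left to right after every step, and each upper register holds a key no larger than the lower register of the same cell, which is why the leftmost upper register always holds the smallest key.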

Paragraph 6

  • The same principle applies to the descending sort.
  • We have only to replace $+\infty$ with $-\infty$ and
    • interchange “smaller” and “larger.”
  • In order to distinguish between the ascending and the descending sort,
    • we only need a single control line.
  • In the remainder of this section, we consider the ascending sort only.

Paragraph 7

  • The sorting circuit not only processes the key-pointer pairs of a given sequence in a pipelined fashion, but also can sort different sequences in a pipelined way;
  • i.e., while one sorted sequence is being output, a new sequence can be input from the other end of the circuit.
  • In order to distinguish sequences, we attach tag 0 to each key-pointer pair input from the left end and tag 1 to each pair input from the right end.
  • The tags are not compared.

Paragraph 8

  • Figure 5 shows ascending sorting for four sequences in a pipelined way. (Pointers are not shown.)
  • Whenever two pairs with different tags meet at a cell, they are exchanged (no comparison is performed).
  • When two pairs input from the right end (1-tagged) meet at a cell, the smaller one goes to the lower register.
  • As we can see in the third sequence of the example, the sorting circuit still works when a sequence contains equal keys.
  • However, if we require that two pairs with equal keys be output in the same order as they are input, i.e., first in, first out, we can attach a counter value to each key which indicates its input number in a sequence.
  • Of course, $\lceil \log_2 n \rceil$ extra bits are then needed for each register of the key-pointer sorting circuit. (Counter values are not shown in the figure.)
  • After input is completed, $+\infty$'s are input.
    • $+\infty$'s input from the left (right) end are tagged with 0 (1).
  • As shown in the last sequence of the example,
    • the sorting circuit still works in a pipelined way when the length
      of the last sequence is smaller than n.
  • The output of the last sequence starts immediately after the output of the second from the last sequence is complete.
    • Thus pairs are continuously output.
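
The counter idea for equal keys can be sketched in software as follows. The sketch assumes non-negative integer keys and packs the input number into the low-order bits of a composite key, so that comparing composites breaks ties in input order; the helper names are illustrative.

```python
def counter_bits(n):
    """ceil(log2(n)) for n >= 1, computed with integer arithmetic."""
    return (n - 1).bit_length()


def attach_counters(keys, n):
    """Build composite keys: original key in the high bits, input number in the low bits."""
    if len(keys) > n:
        raise ValueError("a sequence holds at most n keys")
    bits = counter_bits(n)
    return [(key << bits) | i for i, key in enumerate(keys)]


def split_composite(composite, n):
    """Recover (key, input number) from a composite key."""
    bits = counter_bits(n)
    return composite >> bits, composite & ((1 << bits) - 1)


n = 4
composites = sorted(attach_counters([3, 1, 3, 1], n))
print([split_composite(c, n) for c in composites])
```

Equal keys come out in their input order because the counter decides every tie.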

Paragraph 9

  • Initially, the sorting circuit must be filled with $+\infty$'s.
  • After several sequences have been sorted (see Step 28 of Fig. 5), the sorter is filled with $+\infty$'s with different tags.
  • However, to sort the next batch of sequences, no reinitialization is needed,
    • because all 0-tagged (1-tagged) $+\infty$'s must reside in the left (right) part of the sorter,
    • which is sufficient to guarantee that the later sequences are correctly sorted.

Paragraph 10

  • In this paper, we do not discuss the detailed logic design of the key-pointer sorting circuit.
  • Suffice it to say that it can be implemented either in a bit-serial or a bit-parallel fashion.
  • Of course, parallel operation is faster but requires more hardware.
    • Since in the initial sorter the key-pointer sorting circuit needs to perform only one step of an operation (shift and compare-exchange)
    • during each transmission of a record and since the record transmission time is much longer than the operation time, serial operation would be enough.

The RAM and the controller
