- 上一篇:模具快速制造技术英文文献和中文翻译
- 下一篇:连续密集查询计划正式溢出策略评估英文文献和中文翻译
All the timelimits above are therefore lower bounds and, in practice, largeramounts of time may be necessary.Table I summarizes the sorters discussed so far. Examiningthe table shows that all of the above sorting designs improvesorting time over single processor sorters. Category A, B, andC sorters are limited to sort times of the order O(N) by theirsequential input and/or sequential output. Category D islimited to sort times of the order O(log2 N)-provided the datacan be input and output in parallel. Thus, increasing thenumber of elements in these sorters only increases the size ofthe list which can be sorted, it has no effect on the basic sortspeed. Furthernmore, none of these sorters can sort lists of ar-bitrary length without modifications ranging from minor tomajor in scope.Significant speed gains require a different approach, onethat either uses something besides comparators or uses themin a novel way. Replacing the comparators by distributors, forexample, replaces the factor log2 N by logb N in all the abovetime estimates. Since logb N equals (log2 N)/(log2 b), thisdecreases the sorting time by the factor 1 /(1og2 b).Two types of distributors have appeared in the literature.The radix distributor distributes the data items into one of bsublists, depending upon the value of the item base b whereevery item in one sublist preceeds every item in the next sublist.The data items are first distributed into b sublists dependingupon the leading digit, base b, of the item. Each sublist is itselffurther distributed into one of b subsublists, depending uponthe second digit, base b, of the item, and so forth. Continuingto subpide each sublist further and further leads sooner orlater to the point where each sublist contains only a single item;at this point the original list is sorted. If the items in the originallist are uniformly distributed, logb N passes suffice; if the itemsin the original list are not uniformly distributed (which seemsto be the usual case) logb MAX passes are required whereMAX is the value of the largest item in the list.To avoid the dependence upon the distribution of values ofdata items in the original list, Winslow and Chow [9] intro-duced the second type of distributor, a balanced tree or bal-anced distributor which uses a distributor containing (b - 1)comparators to distribute the data items into one of b sublists.(A more complete description is given later.) The importantpoint at the moment is that the'balanced distributor can beused as a basic serial input/serial output element in asorter.A tree of distributors can be constructed by placing onedistributor at the root of the tree, b distributors at level one ofthe tree (one distributor for each possible sublist generated bythe root distributor), and so forth, with bP distributors at levelp of the tree. A data item is inserted at the root level and passesdown the levels until it reaches the output level. The result isa sequential input/parallel output sorter with size and timecomplexities of 0((N - 1)/(b - I)) and O(N + log, N), re-spectively. The correspondences between this tree sorter andSong's tree sorter are obvious.Corresponding to Batcher's results, it is shown in [WIN 81]that a network of O(N * lgb N) distributors with sort time ofO(logb N) is possible. It is also shown in [9] that a sort time of0(1) is possible using N distributors, each of base MAX, andO(N2) switches. Comparing this result with that of [5] showsthe replacing the comparators with different types of elementsproduces a significant speed gain with approximately the samenumber of elements.
These correspondences between comparator and distribu-tor-based sorters suggest that distributor-based sorters havean inherent advantage over comparator-based sorters: they canuse a base larger than two. (To introduce a base larger thantwo into a comparator based sorter, one needs a base b com-parator or a sorter which sorts b items, which is the same thing.The fastest such sorter would be based upon [5]'s results andwould require 0(10g2 b) time units for a b-way comparisonrather than the 0(1) time units required for a two-way com-parator.)In the remainder of this paper, we present a balanced treesorter (BT) and a parallel balanced tree sorter (PBT, categoryE in Table I). These architectures incorporate the two essentialfeatures for speeding up sorting time; they are both distribu-tor-based and their speed is a direct function of the number ofelements used (that is, doubling the number of distributors halves the sorting time). They have further advantage that inboth systems the number of distributors has no effect on thesize of the list which can be sorted.III. THE BALANCED TREE SORTERIn a balanced tree, all subtrees have the same height. As-sume each node in a balanced tree contains (b - 1) values andhas b sons. An item in the list to be sorted is compared to all(b - I) values and the "nearest" value is used to determinewhich son node is used for the next comparison. As the itemdescends the tree, always using the "nearest" son for the nextcomparison node, sooner or later it reaches a node where amatch is found or it reaches the bottom of the tree. Assumingthe node values have been correctly chosen so that the treeremains balanced, when all the items in the list have been in-serted in the tree, the sorted list can be obtained from the nodevalues. Since the tree is balanced, it contains (logb N)levels.The assumption that the exact node values are known inadvance can be avoided by using a variation on Quicksort [4].Starting with the original list and choosing (b - 1) values atrandom from the list as the node values for the root of the tree,this node can be used to insert each item in the list into a bucketcorresponding to the correct son. The results of the first passare then a set df b buckets or sublists with every item in onesublist preceding every item in the next sublist; that is, the firstpass generates b sublists with a sort ordering between thesublists.Each sublist (the contents of one bucket) is now processedthe same way the original list was processed with (b - I) nodevalues chosen at random from the sublist. This generates b2sublists with a sort ordering between all the sublists. Con-tinuing this process reduces the size of the sublists at each passuntil every sublist contains a single item at which point the listis sorted. (In practice the process stops when each sublistcontains less than b items.) Note that this process assumes the(b - 1) node values are themselves sorted by some means.This is essentially Quicksort using (b - 1) comparisons ateach step rather than one. As in Quicksort, choosing nodevalues at random rather than using the "'exact" values affectsthe time less than one might assume. Frazer and McKellar [3]have shown that p(j), the probability that a sublist containsexactly j items after a pass, isp(j) = C(N-j,b-2)/C(N,b- 1)where N is the number of items in the list and C(i, j) is thecombination of i things taken ] at a time. Using combinatorialidentities, it can be shown thatj=N-b+IE jp(j) = C(N-i, b) + iC(N -i, b-1 IJ=i .,t~~~~~~~~~~~~0 and from this expression it can be snown that the mean lengthL of the sublists after a pass isL = (N-b+ 1)/b.To estimate the amount of work required to sort a list oflength N, we assume the following.1) The (b - 1) node values can be loaded into the sorter andthemselves sorted in time O(b - 1).2) Whenever a sublist contains fewer than b items, calleda terminal sublist, it can be sorted and output in time 0(2n),where n is the number of items in the sublist. Since every itemin the original list eventually ends up in a terminal sublist,assumption 2) is equivalent to requiring time O(2N) to collect,sort, and output all the terminal sublists.To determine the number of BT passes required to producethe terminal sublists, we note that the expected length, E(L),of the nonterminal sublist after one pass isN-b+I N-b+I(L)= jpj)/ E pU) j=b j=b= [N + (b -)2]/band by induction, the expected length after k passes E(Lk)isE(Lk) = [N + (b - )(bk - )]/bk.The expected length of the sublists is less than b, then,whenb > E(Lk)orork > logb (N + 1).Since each nonterminal pass requires processing N or feweritems, the time required for BT passes is less than or equal toO(N logb (N + 1)).