# 算法代写 | CS 467/567, Assignment 4

这个作业是完成分而治之的前缀计算

CS 467/567, Assignment 4

Due on 22 April at 11:59 pm

This assignment can be solved in teams of no more than two students. Submit your solutions by

email as a document typeset to PDF. Include all your references (if any) and also cite them in the

text as appropriate.

Late submissions will incur a penalty of 20% per day late.

Problem 1: Divide and Conquer Prefix Computations

We use again divide and conquer method to perform prefix computation (with ◦ as binary operator) on a sequence S of size n. This time the running time will be O(log log n) time using

O(n/ log log n) processors. This algorithm would be optimal, faster than the one mentioned in

the class notes, but also using more processors. To obtain such an algorithm we proceed as follows:

(1) Divide the sequence S into n

1/2

subsequences of size n

1/2

each. Perform prefix computation

on each sequence recursively using n

1/2 processors. Let the result of the computation on the

i-th sequence be hs(i, 1),s(i, 2), . . . ,s(i, n

1/2

)i.

(2) Perform a prefix computation on the sequence hs(1, n

1/2

),s(2, n

1/2

), . . . ,s(n

1/2 − 1, n

1/2

)i using n processors. Let the result of this computation be hs

0

(1, n

1/2

),s

0

(2, n

1/2

), . . . ,s

0

(n

1/2 −

1, n

1/2

)i.

(3) For all 1 ≤ i < n

1/2 use ◦ to combine s

0

(i, n

1/2

) with all the elements of hs(i + 1, 1),s(i +

1, 2), . . . ,s(i + 1, n

1/2

)i using n

1/2 processors per subsequence.

1. Show that the algorithm described above uses n processors to run in O(log log n) time.

2. Show how the number of processors can be reduced to O(n/ log log n) while maintaining

the O(log log n) running time, thus achieving the optimal cost O(n).

3. Discuss the implications of the result obtained in Question 2 on related computations such

as array packing.

Problem 2: Descending in a Hypercube

Let N = 2

g data be stored in the processors of a hypercube, one value per processor (so that the

value xi

is stored in Pi

, 0 ≤ i < N). A technique with wide applicability to hypercube algorithms

is called DESCEND and consists of g iterations. During the j-th iteration a basic binary operation

OPERATION(Pi

, Pl) is performed on data whose indices differ by 2j

. Here is the pseudocode for

this process, with ig−1ig−2 . . . i1i0 the binary representation of index i:

Algorithm DESCEND:

for j = g − 1 down to 0 do

for i = 0 to 2

g − 1 do in parallel

if ij = 0 then OPERATION(Pi

, Pi+2

j)

If OPERATION requires constant time then DESCEND runs in O(log N) time. A dual to DESCEND

is ASCEND, in which j in the outer loop varies from 0 to g − 1.

1. Show how DESCEND can be used to obtain an algorithm that computes the sum of all the

values xi and places the result in P0.

2. Use either ASCEND or DESCEND to broadcast the datum held by some processor Pk

to all the

other processors.

3. Suggest other problems that can be solved using these paradigms.

2