Next: Chapter Notes Up: 4
Putting Components Together Previous: 4.7
Summary
- Discuss ways in which modular design techniques are used in the design of
an automobile engine, house, or suspension bridge.
- Identify ways in which modularity ideas might apply to the management of
an orchestra, educational institution, or company.
- Develop analytic models for the maximum
throughput (in requests processed per second) supported by the centralized and
distributed tuple space implementations outlined in Section 4.5.
- Using a language that supports concurrent composition, such as CC++ \ or
Fortran M, implement centralized and distributed tuple space modules. Study
the performance of these modules in a simple parameter study problem, and
relate performance to the models of Exercise 3.
- Discuss ways in which the database search algorithm of Section 4.5
could be modified to avoid the central bottleneck inherent in a single
manager.
- An alternative parallel algorithm for the 2-D FFT considered in Section 4.4.1
assumes a fixed 2-D decomposition of data structures; hence, communication is
required within each 1-D FFT. Use a performance model to contrast this
algorithm with those described in the text. Assume that the communication
costs for a 1-D FFT algorithm operating on N points on P
processors are
.
- A problem comprises two components, A
and B . A can be solved in 1000 seconds on computer
C1 and in 5000 seconds on computer C2 ; B requires
4000 and 2000 seconds on C1 and C2 , respectively. The two
computers are connected by a 1000-km optical fiber link that can transfer data
at 100 MB/sec with a latency of 10 msec. The two components can execute
concurrently but must transfer 10 MB of data 10,000 times during execution. Is
it cheapest (in terms of computational resources consumed) to solve the
problem on computer C1 , on computer C2 , or on both
computers together?
- A problem similar to that of Exercise 7 is found
to use fewer computational resources when run on the two networked computers
than on either computer alone. Your public relations office proposes to
promote this as an example of superlinear speedup. Do you agree?
- A climate model consists of an atmosphere and ocean model. At each step,
the models both perform a finite difference computation and exchange five 2-D
arrays. Develop performance models for two alternative parallel
implementations, based on sequential and parallel composition respectively.
Discuss the relative performance of the two implementations.
- Determine the problem size and processor count regimes in which each of
the three convolution algorithms described in Example 4.4 would
be faster, assuming machine parameters characteristic of (a) a multicomputer
and (b) an Ethernet-connected LAN.
- The performance analysis of the pipelined algorithms considered in Section
4.4
did not take into account idle time incurred when starting up and shutting
down pipelines. Refine the performance models to account for these costs. How
many images must be processed before efficiency reaches 95 percent of that
predicted in the absence of these costs?
- Execute by hand for N=P=4 the matrix multiplication algorithm
based on a 2-D decomposition described in Section 4.6.1.
- A simpler variant of the multiplication
algorithm for matrices decomposed in two dimensions uses a ring rather than a
tree for the broadcast operations. Use performance models to compare the two
algorithm variants. (Note that a broadcast can be performed on a P
-node bidirectional ring in approximately P/2 steps.)
- Extend the analysis and comparison of Exercise 13 to
account for competition for bandwidth in a 2-D mesh architecture.
- Develop a performance model for the systolic matrix multiplication
algorithm of Section 4.6, and
use this to identify regimes in which this algorithm is faster than the
simpler algorithm based on regular 2-D decompositions. Account for data
reorganization costs.
- Another matrix multiplication algorithm collects the matrices that are to
be multiplied on a single processor. Use performance models to identify
regimes in which this algorithm might be competitive. Conduct empirical
studies to validate your analytic results.
Next: Chapter Notes Up: 4
Putting Components Together Previous: 4.7
Summary
© Copyright 1995 by Ian Foster