References
- 1
- ACM. Resources in Parallel and Concurrent Systems. ACM Press,
1991.
- 2
- G. Adams, D. Agrawal, and H. Siegel. A survey and comparison of
fault-tolerant multistage interconnection networks. IEEE Trans.
Computs., C-20(6):14--29, 1987.
- 3
- J. Adams, W. Brainerd, J. Martin, B. Smith, and J. Wagener. The
Fortran 90 Handbook. McGraw-Hill, 1992.
- 4
- A. Aggarwal and J. S. Vitter. The input/output complexity of sorting and
related problems. Commun. ACM, 31(9):1116--1127, 1988.
- 5
- G. Agha. Actors. MIT Press, 1986.
- 6
- G. Agrawal, A. Sussman, and J. Saltz. Compiler and runtime support for
structured and block structured applications. In Proc. Supercomputing
'93, pages 578--587, 1993.
- 7
- A. Aho, J. Hopcroft, and J. Ullman. The Design and Analysis of
Computer Algorithms. Addison-Wesley, 1974.
- 8
- S. Akl. The Design and Analysis of Parallel Algorithms.
Prentice-Hall, 1989.
- 9
- S. G. Akl and K. A. Lyons. Parallel Computational Geometry.
Prentice-Hall, 1993.
- 10
- E. Albert, J. Lukas, and G. Steele. Data parallel computers and the FORALL
statement. J. Parallel and Distributed Computing, 13(2):185--192,
1991.
- 11
- G. S. Almasi and A. Gottlieb. Highly Parallel Computing.
Benjamin/Cummings, second edition, 1994.
- 12
- G. Amdahl. Validity of the single-processor approach to achieving
large-scale computing capabilities. In Proc. 1967 AFIPS Conf., volume
30, page 483. AFIPS Press, 1967.
- 13
- S. Anderson. Random number generators. SIAM Review,
32(2):221--251, 1990.
- 14
- G. R. Andrews. Concurrent Programming: Principles and Practice.
Benjamin/Cummings, 1991.
- 15
- G. R. Andrews and R. A. Olsson. The SR Programming Language:
Concurrency in Practice. Benjamin/Cummings, 1993.
- 16
- ANSI X3J3/S8.115. Fortran 90, 1990.
- 17
- S. Arvindam, V. Kumar, and V. Rao. Floorplan optimization on
multiprocessors. In Proc. 1989 Intl Conf. on Computer Design, pages
109--113. IEEE Computer Society, 1989.
- 18
- W. C. Athas and C. L. Seitz. Multicomputers: Message-passing concurrent
computers. Computer, 21(8):9--24, 1988.
- 19
- J. Auerbach, A. Goldberg, G. Goldszmidt, A. Gopal, M. Kennedy, J. Rao, and
J. Russell. Concert/C: A language for distributed programming. In Winter
1994 USENIX Conference. Usenix Association, 1994.
- 20
- A. Averbuch, E. Gabber, B. Gordissky, and Y. Medan. A parallel FFT on an
MIMD machine. Parallel Computing, 15:61--74, 1990.
- 21
- D. Bailey. FFTs in external or hierarchical memory. J.
Supercomputing, 4:23--35, 1990.
- 22
- J. Bailey. First we reshape our computers, then they reshape us: The
broader intellectual impact of parallelism. Daedalus, 121(1):67--86,
1992.
- 23
- H. E. Bal, J. G. Steiner, and A. S. Tanenbaum. Programming languages for
distributed computing systems. ACM Computing Surveys, 21(3):261--322,
1989.
- 24
- V. Bala and S. Kipnis. Process groups: A mechanism for the coordination of
and communication among processes in the Venus collective communication
library. Technical report, IBM T. J. Watson Research Center, 1992.
- 25
- V. Bala, S. Kipnis, L. Rudolph, and M. Snir. Designing efficient,
scalable, and portable collective communication libraries. Technical report,
IBM T. J. Watson Research Center, 1992. Preprint.
- 26
- P. Banerjee. Parallel Algorithms For VLSI Computer-Aided Design.
Prentice-Hall, 1994.
- 27
- U. Banerjee. Dependence Analysis for Supercomputing. Kluwer
Academic Publishers, 1988.
- 28
- S. Barnard and H. Simon. Fast multilevel implementation of recursive
spectral bisection for partitioning unstructured problems. Concurrency:
Practice and Experience, 6(2):101--117, 1994.
- 29
- J. Barton and L. Nackman. Scientific and Engineering C++.
Addison-Wesley, 1994.
- 30
- K. Batcher. Sorting networks and their applications. In Proc. 1968
AFIPS Conf., volume 32, page 307. AFIPS Press, 1968.
- 31
- BBN Advanced Computers Inc. TC-2000 Technical Product Summary,
1989.
- 32
- M. Ben-Ari. Principles of Concurrent and Distributed Programming.
Prentice-Hall, 1990.
- 33
- M. Berger and S. Bokhari. A partitioning strategy for nonuniform problems
on multiprocessors. IEEE Trans. Computs., C-36(5):570--580, 1987.
- 34
- F. Berman and L. Snyder. On mapping parallel algorithms into parallel
architectures. J. Parallel and Distributed Computing, 4(5):439--458,
1987.
- 35
- D. Bertsekas and J. Tsitsiklis. Parallel and Distributed Computation:
Numerical Methods. Prentice-Hall, 1989.
- 36
- D. P. Bertsekas, C. Ozveren, G. D. Stamoulis, P. Tseng, and J. N.
Tsitsiklis. Optimal communication algorithms for hypercubes. J. Parallel
and Distributed Computing, 11:263--275, 1991.
- 37
- G. Blelloch. Vector Models for Data-Parallel Computing. MIT
Press, 1990.
- 38
- F. Bodin, P. Beckman, D. B. Gannon, S. Narayana, and S. Yang. Distributed
pC++: Basic ideas for an object parallel language. In Proc. Supercomputing
'91, pages 273--282, 1991.
- 39
- S. Bokhari. On the mapping problem. IEEE Trans. Computs.,
C-30(3):207--214, 1981.
- 40
- G. Booch. Object-Oriented Design with Applications.
Benjamin/Cummings, 1991.
- 41
- R. Bordawekar, J. del Rosario, and A. Choudhary. Design and evaluation of
primitives for parallel I/O. In Proc. Supercomputing '93, pages
452--461. ACM, 1993.
- 42
- Z. Bozkus, A. Choudhary, G. Fox, T. Haupt, and S. Ranka. Fortran 90D/HPF
compiler for distributed memory MIMD computers: Design, implementation, and
performance results. In Proc. Supercomputing '93. IEEE Computer
Society, 1993.
- 43
- W. Brainerd, C. Goldberg, and J. Adams. Programmer's Guide to Fortran
90. McGraw-Hill, 1990.
- 44
- R. Butler and E. Lusk. Monitors, messages, and clusters: The p4 parallel
programming system. Parallel Computing, 20:547--564, 1994.
- 45
- D. Callahan and K. Kennedy. Compiling programs for distributed-memory
multiprocessors. J. Supercomputing, 2:151--169, 1988.
- 46
- G. F. Carey, editor. Parallel Supercomputing: Methods, Algorithms and
Applications. Wiley, 1989.
- 47
- N. Carriero and D. Gelernter. Linda in context. Commun. ACM,
32(4):444--458, 1989.
- 48
- N. Carriero and D. Gelernter. How to Write Parallel Programs. MIT
Press, 1990.
- 49
- N. Carriero and D. Gelernter. Tuple analysis and partial evaluation
strategies in the Linda pre-compiler. In Languages and Compilers for
Parallel Computing. MIT Press, 1990.
- 50
- R. Chandra, A. Gupta, and J. Hennessy. COOL: An object-based language for
parallel programming. Computer, 27(8):14--26, 1994.
- 51
- K. M. Chandy and I. Foster. A deterministic notation for cooperating
processes. IEEE Trans. Parallel and Distributed Syst., 1995. To
appear.
- 52
- K. M. Chandy, I. Foster, K. Kennedy, C. Koelbel, and C.-W. Tseng.
Integrated support for task and data parallelism. Intl J. Supercomputer
Applications, 8(2):80--98, 1994.
- 53
- K. M. Chandy and C. Kesselman. CC++: A declarative concurrent
object-oriented programming notation. In Research Directions in Concurrent
Object-Oriented Programming. MIT Press, 1993.
- 54
- K. M. Chandy and J. Misra. Parallel Program Design.
Addison-Wesley, 1988.
- 55
- K. M. Chandy and S. Taylor. An Introduction to Parallel
Programming. Jones and Bartlett, 1992.
- 56
- B. Chapman, P. Mehrotra, and H. Zima. Programming in Vienna Fortran.
Scientific Programming, 1(1):31--50, 1992.
- 57
- B. Chapman, P. Mehrotra, and H. Zima. Extending HPF for advanced
data-parallel applications. IEEE Parallel and Distributed Technology,
2(3):15--27, 1994.
- 58
- D. Y. Cheng. A survey of parallel programming languages and tools.
Technical Report RND-93-005, NASA Ames Research Center, Moffett Field, Calif.,
1993.
- 59
- J. Choi, J. Dongarra, and D. Walker. PUMMA: Parallel Universal Matrix
Multiplication Algorithms on distributed memory concurrent computers.
Concurrency: Practice and Experience, 6, 1994.
- 60
- A. Choudhary. Parallel I/O systems, guest editor's introduction. J.
Parallel and Distributed Computing, 17(1--2):1--3, 1993.
- 61
- S. Chowdhury. The greedy load-sharing algorithm. J. Parallel and
Distributed Computing, 9(1):93--99, 1990.
- 62
- M. Colvin, C. Janssen, R. Whiteside, and C. Tong. Parallel Direct-SCF for
large-scale calculations. Technical report, Center for Computational
Engineering, Sandia National Laboratories, Livermore, Cal., 1991.
- 63
- D. Comer. Internetworking with TCP/IP. Prentice-Hall, 1988.
- 64
- S. Cook. The classification of problems which have fast parallel
algorithms. In Proc. 1983 Intl Foundations of Computation Theory
Conf., volume 158, pages 78--93. Springer-Verlag LNCS, 1983.
- 65
- T. Cormen, C. Leiserson, and R. Rivest. Introduction to
Algorithms. MIT Press, 1990.
- 66
- B. Cox and A. Novobilski. Object-Oriented Programming: An Evolutionary
Approach. Addison-Wesley, 1991.
- 67
- D. Culler et al. LogP: Towards a realistic model of parallel computation.
In Proc. 4th Symp. Principles and Practice of Parallel Programming,
pages 1--12. ACM, 1993.
- 68
- G. Cybenko. Dynamic load balancing for distributed memory multiprocessors.
J. Parallel and Distributed Computing, 7:279--301, 1989.
- 69
- W. Dally. A VLSI Architecture for Concurrent Data Structures.
Kluwer Academic Publishers, 1987.
- 70
- W. Dally and C. L. Seitz. The torus routing chip. J. Distributed
Systems, 1(3):187--196, 1986.
- 71
- W. Dally and C. L. Seitz. Deadlock-free message routing in multiprocessor
interconnection networks. IEEE Trans. Computs., C-36(5):547--553,
1987.
- 72
- W. J. Dally et al. The message-driven processor. IEEE Micro,
12(2):23--39, 1992.
- 73
- C. R. Das, N. Deo, and S. Prasad. Parallel graph algorithms for hypercube
computers. Parallel Computing, 13:143--158, 1990.
- 74
- C. R. Das, N. Deo, and S. Prasad. Two minimum spanning forest algorithms
on fixed-size hypercube computers. Parallel Computing, 15:179--187,
1990.
- 75
- A. L. DeCegama. The Technology of Parallel Processing: Parallel
Processing Architectures and VLSI Hardware: Volume 1. Prentice-Hall,
1989.
- 76
- J. del Rosario and A. Choudhary. High-Performance I/O for Parallel
Computers: Problems and Prospects. Computer, 27(3):59--68, 1994.
- 77
- J. W. Demmel, M. T. Heath, and H. A. van der Vorst. Parallel numerical
linear algebra. Acta Numerica, 10:111--197, 1993.
- 78
- P. M. Dew, R. A. Earnshaw, and T. R. Heywood. Parallel Processing for
Computer Vision and Display. Addison-Wesley, 1989.
- 79
- D. DeWitt and J. Gray. Parallel database systems: The future of
high-performance database systems. Commun. ACM, 35(6):85--98, 1992.
- 80
- E. W. Dijkstra. A note on two problems in connexion with graphs.
Numerische Mathematik, 1:269--271, 1959.
- 81
- E. W. Dijkstra, W. H. J. Feijen, and A. J. M. van Gasteren. Derivation of a
termination detection algorithm for a distributed computation. Information
Processing Letters, 16(5):217--219, 1983.
- 82
- J. Dongarra, I. Duff, D. Sorensen, and H. van der Vorst. Solving
Linear Systems on Vector and Shared Memory Computers. SIAM, 1991.
- 83
- J. Dongarra, R. Pozo, and D. Walker. ScaLAPACK++: An object-oriented
linear algebra library for scalable systems. In Proc. Scalable Parallel
Libraries Conf., pages 216--223. IEEE Computer Society, 1993.
- 84
- J. Dongarra, R. van de Geijn, and D. Walker. Scalability issues affecting
the design of a dense linear algebra library. J. Parallel and Distributed
Computing, 22(3):523--537, 1994.
- 85
- J. Dongarra and D. Walker. Software libraries for linear algebra
computations on high performance computers. SIAM Review, 1995. To
appear.
- 86
- J. Drake, I. Foster, J. Hack, J. Michalakes, B. Semeraro, B. Toonen, D.
Williamson, and P. Worley. PCCM2: A GCM adapted for scalable parallel
computers. In Proc. 5th Symp. on Global Change Studies, pages 91--98.
American Meteorological Society, 1994.
- 87
- R. Duncan. A survey of parallel computer architectures. Computer,
23(2):5--16, 1990.
- 88
- R. Duncan. Parallel computer architectures. In Advances in
Computers, volume 34, pages 113--152. Academic Press, 1992.
- 89
- D. L. Eager, J. Zahorjan, and E. D. Lazowska. Speedup versus efficiency in
parallel systems. IEEE Trans. Computs., C-38(3):408--423, 1989.
- 90
- Edinburgh Parallel Computing Centre, University of Edinburgh. CHIMP
Concepts, 1991.
- 91
- Edinburgh Parallel Computing Centre, University of Edinburgh. CHIMP
Version 1.0 Interface, 1992.
- 92
- M. A. Ellis and B. Stroustrup. The Annotated C++ Reference
Manual. Addison-Wesley, 1990.
- 93
- V. Faber, O. Lubeck, and A. White. Superlinear speedup of an efficient
parallel algorithm is not possible. Parallel Computing, 3:259--260,
1986.
- 94
- T. Y. Feng. A survey of interconnection networks. IEEE Computer,
14(12):12--27, 1981.
- 95
- J. Feo, D. Cann, and R. Oldehoeft. A report on the SISAL language project.
J. Parallel and Distributed Computing, 12(10):349--366, 1990.
- 96
- M. Feyereisen and R. Kendall. An efficient implementation of the
Direct-SCF algorithm on parallel computer architectures. Theoretica
Chimica Acta, 84:289--299, 1993.
- 97
- H. P. Flatt and K. Kennedy. Performance of parallel processors.
Parallel Computing, 12(1):1--20, 1989.
- 98
- R. Floyd. Algorithm 97: Shortest path. Commun. ACM, 5(6):345,
1962.
- 99
- S. Fortune and J. Wyllie. Parallelism in random access machines. In
Proc. ACM Symp. on Theory of Computing, pages 114--118. ACM, 1978.
- 100
- I. Foster. Task parallelism and high performance languages. IEEE
Parallel and Distributed Technology, 2(3):39--48, 1994.
- 101
- I. Foster, B. Avalani, A. Choudhary, and M. Xu. A compilation system that
integrates High Performance Fortran and Fortran M. In Proc. 1994 Scalable
High-Performance Computing Conf., pages 293--300. IEEE Computer Society,
1994.
- 102
- I. Foster and K. M. Chandy. Fortran M: A language for modular parallel
programming. J. Parallel and Distributed Computing, 25(1), 1995.
- 103
- I. Foster, M. Henderson, and R. Stevens. Data systems for parallel climate
models. Technical Report ANL/MCS-TM-169, Mathematics and Computer Science
Division, Argonne National Laboratory, Argonne, Ill., 1991.
- 104
- I. Foster, C. Kesselman, and S. Taylor. Concurrency: Simple concepts and
powerful tools. Computer J., 33(6):501--507, 1990.
- 105
- I. Foster, R. Olson, and S. Tuecke. Productive parallel programming: The
PCN approach. Scientific Programming, 1(1):51--66, 1992.
- 106
- I. Foster, R. Olson, and S. Tuecke. Programming in Fortran M. Technical
Report ANL-93/26, Mathematics and Computer Science Division, Argonne National
Laboratory, Argonne, Ill., 1993.
- 107
- I. Foster and S. Taylor. Strand: New Concepts in Parallel
Programming. Prentice-Hall, 1989.
- 108
- I. Foster, J. Tilson, A. Wagner, R. Shepard, R. Harrison, R. Kendall, and
R. Littlefield. High performance computational chemistry: (I) Scalable Fock
matrix construction algorithms. Preprint, Mathematics and Computer Science
Division, Argonne National Laboratory, Argonne, Ill., 1994.
- 109
- I. Foster and B. Toonen. Load-balancing algorithms for climate models. In
Proc. 1994 Scalable High-Performance Computing Conf., pages 674--681.
IEEE Computer Society, 1994.
- 110
- I. Foster and P. Worley. Parallel algorithms for the spectral transform
method. Preprint MCS-P426-0494, Mathematics and Computer Science Division,
Argonne National Laboratory, Argonne, Ill., 1994.
- 111
- G. Fox et al. Solving Problems on Concurrent Processors.
Prentice-Hall, 1988.
- 112
- G. Fox, S. Hiranandani, K. Kennedy, C. Koelbel, U. Kremer, C. Tseng, and
M. Wu. Fortran D language specification. Technical Report TR90-141, Dept. of
Computer Science, Rice University, 1990.
- 113
- G. Fox, R. Williams, and P. Messina. Parallel Computing Works!
Morgan Kaufmann, 1994.
- 114
- P. Frederickson, R. Hiromoto, T. Jordan, B. Smith, and T. Warnock.
Pseudo-random trees in Monte Carlo. Parallel Computing, 1:175--180,
1984.
- 115
- H. J. Fromm, U. Hercksen, U. Herzog, K. H. John, R. Klar, and W.
Kleinöder. Experiences with performance measurement and modeling of a
processor array. IEEE Trans. Computs., C-32(1):15--31, 1983.
- 116
- K. Gallivan, R. Plemmons, and A. Sameh. Parallel algorithms for dense
linear algebra computations. SIAM Review, 32(1):54--135, 1990.
- 117
- N. Gehani and W. Roome. The Concurrent C Programming Language.
Silicon Press, 1988.
- 118
- G. A. Geist, M. T. Heath, B. W. Peyton, and P. H. Worley. A user's guide
to PICL: A portable instrumented communication library. Technical Report
TM-11616, Oak Ridge National Laboratory, 1990.
- 119
- A. Gibbons and W. Rytter. Efficient Parallel Algorithms.
Cambridge University Press, 1990.
- 120
- G. A. Gibson. Redundant Disk Arrays: Reliable, Parallel Secondary
Storage. MIT Press, 1992.
- 121
- H. Goldstine and J. von Neumann. On the principles of large-scale
computing machines. In Collected Works of John von Neumann, Vol. 5.
Pergamon, 1963.
- 122
- G. H. Golub and J. M. Ortega. Scientific Computing: An Introduction
with Parallel Computing. Academic Press, 1993.
- 123
- A. Gottlieb, R. Grishman, C. P. Kruskal, K. P. McAuliffe, L. Rudolph, and
M. Snir. The NYU ultracomputer: Designing a MIMD, shared memory parallel
computer. IEEE Trans. Computs., C-32(2):175--189, 1983.
- 124
- S. Graham, P. Kessler, and M. McKusick. gprof: A call graph execution
profiler. In Proc. SIGPLAN '82 Symposium on Compiler Construction,
pages 120--126. ACM, 1982.
- 125
- A. S. Grimshaw. An introduction to parallel object-oriented programming
with Mentat. Technical Report 91 07, University of Virginia, 1991.
- 126
- W. Gropp, E. Lusk, and A. Skjellum. Using MPI: Portable Parallel
Programming with the Message Passing Interface. MIT Press, 1995.
- 127
- W. Gropp and B. Smith. Scalable, extensible, and portable numerical
libraries. In Proc. Scalable Parallel Libraries Conf., pages 87--93.
IEEE Computer Society, 1993.
- 128
- A. Gupta. Parallelism in Production Systems. Morgan Kaufmann,
1987.
- 129
- J. L. Gustafson. Reevaluating Amdahl's law. Commun. ACM,
31(5):532--533, 1988.
- 130
- J. L. Gustafson, G. R. Montry, and R. E. Benner. Development of parallel
methods for a 1024-processor hypercube. SIAM J. Sci. and Stat.
Computing, 9(4):609--638, 1988.
- 131
- A. Hac. Load balancing in distributed systems: A summary. Performance
Evaluation Review, 16(2):17--19, 1989.
- 132
- G. Haring and G. Kotsis, editors. Performance Measurement and
Visualization of Parallel Systems. Elsevier Science Publishers, 1993.
- 133
- P. Harrison. Analytic models for multistage interconnection networks.
J. Parallel and Distributed Computing, 12(4):357--369, 1991.
- 134
- P. Harrison and N. M. Patel. The representation of multistage
interconnection networks in queuing models of parallel systems. J.
ACM, 37(4):863--898, 1990.
- 135
- R. Harrison et al. High performance computational chemistry: (II) A
scalable SCF code. Preprint, Mathematics and Computer Science Division,
Argonne National Laboratory, Argonne, Ill., 1994.
- 136
- P. Hatcher and M. Quinn. Data-Parallel Programming on MIMD
Computers. MIT Press, 1991.
- 137
- P. Hatcher, M. Quinn, et al. Data-parallel programming on MIMD computers.
IEEE Trans. Parallel and Distributed Syst., 2(3):377--383, 1991.
- 138
- M. Heath. Recent developments and case studies in performance
visualization using ParaGraph. In Performance Measurement and
Visualization of Parallel Systems, pages 175--200. Elsevier Science
Publishers, 1993.
- 139
- M. Heath and J. Etheridge. Visualizing the performance of parallel
programs. IEEE Software, 8(5):29--39, 1991.
- 140
- M. Heath, E. Ng, and B. Peyton. Parallel algorithms for sparse linear
systems. SIAM Review, 33(3):420--460, 1991.
- 141
- M. Heath, A. Rosenberg, and B. Smith. The physical mapping problem for
parallel architectures. J. ACM, 35(3):603--634, 1988.
- 142
- W. Hehre, L. Radom, P. Schleyer, and J. Pople. Ab Initio Molecular
Orbital Theory. John Wiley and Sons, 1986.
- 143
- R. Hempel. The ANL/GMD macros (PARMACS) in Fortran for portable parallel
programming using the message passing programming model -- users' guide and
reference manual. Technical report, GMD, Postfach 1316, D-5205 Sankt Augustin
1, Germany, 1991.
- 144
- R. Hempel, H.-C. Hoppe, and A. Supalov. PARMACS 6.0 library interface
specification. Technical report, GMD, Postfach 1316, D-5205 Sankt Augustin 1,
Germany, 1992.
- 145
- M. Henderson, B. Nickless, and R. Stevens. A scalable high-performance I/O
system. In Proc. 1994 Scalable High-Performance Computing Conf.,
pages 79--86. IEEE Computer Society, 1994.
- 146
- P. Henderson. Functional Programming. Prentice-Hall, 1980.
- 147
- J. Hennessy and N. Jouppi. Computer technology and architecture: An
evolving interaction. Computer, 24(9):18--29, 1991.
- 148
- V. Herrarte and E. Lusk. Studying parallel program behavior with
upshot. Technical Report ANL-91/15, Mathematics and Computer Science
Division, Argonne National Laboratory, Argonne, Ill., 1991.
- 149
- High Performance Fortran Forum. High Performance Fortran language
specification, version 1.0. Technical Report CRPC-TR92225, Center for Research
on Parallel Computation, Rice University, Houston, Tex., 1993.
- 150
- W. D. Hillis. The Connection Machine. MIT Press, 1985.
- 151
- W. D. Hillis and G. L. Steele. Data parallel algorithms. Commun.
ACM, 29(12):1170--1183, 1986.
- 152
- S. Hiranandani, K. Kennedy, and C. Tseng. Compiling Fortran D for MIMD
distributed-memory machines. Commun. ACM, 35(8):66--80, 1992.
- 153
- C. A. R. Hoare. Quicksort. Computer J., 5(1):10--15, 1962.
- 154
- C. A. R. Hoare. Communicating Sequential Processes.
Prentice-Hall, 1984.
- 155
- G. Hoffmann and T. Kauranne, editors. Parallel Supercomputing in the
Atmospheric Sciences. World Scientific, 1993.
- 156
- K. Hwang. Advanced Computer Architecture: Parallelism, Scalability,
Programmability. McGraw-Hill, 1993.
- 157
- J. JáJá. An Introduction to Parallel Algorithms. Addison-Wesley,
1992.
- 158
- J. Jenq and S. Sahni. All pairs shortest paths on a hypercube
multiprocessor. In Proc. 1987 Intl. Conf. on Parallel Processing,
pages 713--716, 1987.
- 159
- S. L. Johnsson. Communication efficient basic linear algebra computations
on hypercube architectures. J. Parallel and Distributed Computing,
4(2):133--172, 1987.
- 160
- S. L. Johnsson and C.-T. Ho. Optimum broadcasting and personalized
communication in hypercubes. IEEE Trans. Computs.,
C-38(9):1249--1268, 1989.
- 161
- M. Jones and P. Plassmann. Parallel algorithms for the adaptive refinement
and partitioning of unstructured meshes. In Proc. 1994 Scalable
High-Performance Computing Conf., pages 478--485. IEEE Computer Society,
1994.
- 162
- R. Kahn. Resource-sharing computer communication networks. Proc.
IEEE, 60(11):1397--1407, 1972.
- 163
- M. Kalos. The Basics of Monte Carlo Methods. J. Wiley and Sons,
1985.
- 164
- L. N. Kanal and V. Kumar. Search in Artificial Intelligence.
Springer-Verlag, 1988.
- 165
- A. Karp and R. Babb. A comparison of twelve parallel Fortran dialects.
IEEE Software, 5(5):52--67, 1988.
- 166
- A. H. Karp. Programming for parallelism. IEEE Computer,
20(9):43--57, 1987.
- 167
- A. H. Karp and H. P. Flatt. Measuring parallel processor performance.
Commun. ACM, 33(5):539--543, 1990.
- 168
- R. Katz, G. Gibson, and D. Patterson. Disk system architectures for high
performance computing. Proc. IEEE, 77(12):1842--1858, 1989.
- 169
- W. J. Kaufmann and L. L. Smarr. Supercomputing and the Transformation
of Science. Scientific American Library, 1993.
- 170
- B. Kernighan and D. Ritchie. The C Programming Language.
Prentice-Hall, second edition, 1988.
- 171
- J. Kerrigan. Migrating to Fortran 90. O'Reilly and Associates,
1992.
- 172
- C. Kesselman. Integrating Performance Analysis with Performance
Improvement in Parallel Programs. PhD thesis, UCLA, 1991.
- 173
- L. Kleinrock. On the modeling and analysis of computer networks. Proc.
IEEE, 81(8):1179--1191, 1993.
- 174
- D. Knuth. The Art of Computer Programming: Volume 3, Sorting and
Searching. Addison-Wesley, 1973.
- 175
- D. Knuth. The Art of Computer Programming: Volume 2, Seminumerical
Algorithms. Addison-Wesley, 1981.
- 176
- C. Koelbel, D. Loveman, R. Schreiber, G. Steele, and M. Zosel. The
High Performance Fortran Handbook. MIT Press, 1994.
- 177
- S. Koonin and D. Meredith. Computational Physics. Addison-Wesley,
1990.
- 178
- J. S. Kowalik. Parallel Computation and Computers for Artificial
Intelligence. Kluwer Academic Publishers, 1988.
- 179
- V. Kumar, A. Grama, A. Gupta, and G. Karypis. Introduction to Parallel
Computing. Benjamin/Cummings, 1993.
- 180
- V. Kumar, A. Grama, and V. Rao. Scalable load balancing techniques for
parallel computers. J. Parallel and Distributed Computing,
22(1):60--79, 1994.
- 181
- V. Kumar and V. Rao. Parallel depth-first search, part II: Analysis.
Intl J. of Parallel Programming, 16(6):479--499, 1987.
- 182
- V. Kumar and V. Singh. Scalability of parallel algorithms for the
all-pairs shortest-path problem. J. Parallel and Distributed
Computing, 13(2):124--138, 1991.
- 183
- T. Lai and S. Sahni. Anomalies in parallel branch-and-bound algorithms.
Commun. ACM, 27(6):594--602, 1984.
- 184
- S. Lakshmivarahan and S. K. Dhall. Analysis and Design of Parallel
Algorithms: Arithmetic and Matrix Problems. McGraw-Hill, 1990.
- 185
- L. Lamport. Time, clocks, and the ordering of events in a distributed
system. Commun. ACM, 21(7):558--565, 1978.
- 186
- H. Lawson. Parallel Processing in Industrial Real-time
Applications. Prentice-Hall, 1992.
- 187
- F. T. Leighton. Introduction to Parallel Algorithms and
Architectures. Morgan Kaufmann, 1992.
- 188
- M. Lemke and D. Quinlan. P++, a parallel C++ array class library for
architecture-independent development of structured grid applications. In
Proc. Workshop on Languages, Compilers, and Runtime Environments for
Distributed Memory Computers. ACM, 1992.
- 189
- E. Levin. Grand challenges in computational science. Commun. ACM,
32(12):1456--1457, 1989.
- 190
- F. C. H. Lin and R. M. Keller. The gradient model load balancing method.
IEEE Trans. Software Eng., SE-13(1):32--38, 1987.
- 191
- V. Lo. Heuristic algorithms for task assignment in distributed systems.
IEEE Trans. Computs., C-37(11):1384--1397, 1988.
- 192
- C. Van Loan. Computational Frameworks for the Fast Fourier Transform.
SIAM, 1992.
- 193
- D. Loveman. High Performance Fortran. IEEE Parallel and Distributed
Technology, 1(1):25--42, 1993.
- 194
- E. Lusk, R. Overbeek, et al. Portable Programs for Parallel
Processors. Holt, Rinehart, and Winston, 1987.
- 195
- U. Manber. On maintaining dynamic information in a concurrent environment.
SIAM J. Computing, 15(4):1130--1142, 1986.
- 196
- O. McBryan. An overview of message passing environments. Parallel
Computing, 20(4):417--444, 1994.
- 197
- O. A. McBryan and E. F. Van de Velde. Hypercube algorithms and
implementations. SIAM J. Sci. and Stat. Computing, 8(2):227--287,
1987.
- 198
- S. McConnell. Code Complete: A Practical Handbook of Software
Construction. Microsoft Press, 1993.
- 199
- C. Mead and L. Conway. Introduction to VLSI Systems.
Addison-Wesley, 1980.
- 200
- P. Mehrotra and J. Van Rosendale. Programming distributed memory
architectures using Kali. In Advances in Languages and Compilers for
Parallel Computing. MIT Press, 1991.
- 201
- J. D. Meindl. Chips for advanced computing. Scientific American,
257(4):78--88, 1987.
- 202
- Message Passing Interface Forum. Document for a standard message-passing
interface. Technical report, University of Tennessee, Knoxville, Tenn., 1993.
- 203
- Message Passing Interface Forum. MPI: A message passing interface. In
Proc. Supercomputing '93, pages 878--883. IEEE Computer Society,
1993.
- 204
- M. Metcalf and J. Reid. Fortran 90 Explained. Oxford Science
Publications, 1990.
- 205
- R. Metcalfe and D. Boggs. Ethernet: Distributed packet switching for local
area networks. Commun. ACM, 19(7):711--719, 1976.
- 206
- J. Michalakes. Analysis of workload and load balancing issues in the NCAR
community climate model. Technical Report ANL/MCS-TM-144, Mathematics and
Computer Science Division, Argonne National Laboratory, Argonne, Ill., 1991.
- 207
- B. Miller et al. IPS-2: The second generation of a parallel program
measurement system. IEEE Trans. Parallel and Distributed Syst.,
1(2):206--217, 1990.
- 208
- E. Miller and R. Katz. Input/output behavior of supercomputing
applications. In Proc. Supercomputing '91, pages 567--576. ACM, 1991.
- 209
- R. Miller and Q. F. Stout. Parallel Algorithms for Regular
Architectures. MIT Press, 1992.
- 210
- R. Milner. Calculi for synchrony and asynchrony. Theoretical Computer
Science, 25:267--310, 1983.
- 211
- nCUBE Corporation. nCUBE 2 Programmers Guide, r2.0, 1990.
- 212
- nCUBE Corporation. nCUBE 6400 Processor Manual, 1990.
- 213
- D. M. Nicol and J. H. Saltz. An analysis of scatter decomposition.
IEEE Trans. Computs., C-39(11):1337--1345, 1990.
- 214
- N. Nilsson. Principles of Artificial Intelligence. Tioga
Publishers, 1980.
- 215
- Grand challenges: High performance computing and communications. A Report
by the Committee on Physical, Mathematical and Engineering Sciences, NSF/CISE,
1800 G Street NW, Washington, DC 20550, 1991.
- 216
- D. Nussbaum and A. Agarwal. Scalability of parallel machines. Commun.
ACM, 34(3):56--61, 1991.
- 217
- R. Paige and C. Kruskal. Parallel algorithms for shortest paths problems.
In Proc. 1989 Intl. Conf. on Parallel Processing, pages 14--19, 1989.
- 218
- C. Pancake and D. Bergmark. Do parallel languages respond to the needs of
scientific programmers? Computer, 23(12):13--23, 1990.
- 219
- Parasoft Corporation. Express Version 1.0: A Communication Environment
for Parallel Computers, 1988.
- 220
- D. Parnas. On the criteria to be used in decomposing systems into modules.
Commun. ACM, 15(12):1053--1058, 1972.
- 221
- D. Parnas. Designing software for ease of extension and contraction.
IEEE Trans. Software Eng., SE-5(2):128--138, 1979.
- 222
- D. Parnas and P. Clements. A rational design process: How and why to fake
it. IEEE Trans. Software Eng., SE-12(2):251--257, 1986.
- 223
- D. Parnas, P. Clements, and D. Weiss. The modular structure of complex
systems. IEEE Trans. Software Eng., SE-11(3):259--266, 1985.
- 224
- J. Patel. Analysis of multiprocessors with private cache memories.
IEEE Trans. Computs., C-31(4):296--304, 1982.
- 225
- J. Pearl. Heuristics---Intelligent Search Strategies for Computer
Problem Solving. Addison-Wesley, 1984.
- 226
- G. F. Pfister, W. C. Brantley, D. A. George, S. L. Harvey, W. J.
Kleinfelder, K. P. McAuliffe, E. A. Melton, V. A. Norton, and J. Weiss. The
IBM research parallel processor prototype (RP3): Introduction and
architecture. In Proc. 1985 Intl Conf. on Parallel Processing, pages
764--771, 1985.
- 227
- P. Pierce. The NX/2 operating system. In Proc. 3rd Conf. on Hypercube
Concurrent Computers and Applications, pages 384--390. ACM Press, 1988.
- 228
- J. Plank and K. Li. Performance results of ickp---A consistent
checkpointer on the iPSC/860. In Proc. 1994 Scalable High-Performance
Computing Conf., pages 686--693. IEEE Computer Society, 1994.
- 229
- J. Pool et al. Survey of I/O intensive applications. Technical Report
CCSF-38, CCSF, California Institute of Technology, 1994.
- 230
- A. Pothen, H. Simon, and K. Liou. Partitioning sparse matrices with
eigenvectors of graphs. SIAM J. Mat. Anal. Appl., 11(3):430--452,
1990.
- 231
- D. Pountain. A Tutorial Introduction to OCCAM Programming. INMOS
Corporation, 1986.
- 232
- A research and development strategy for high performance computing. Office
of Science and Technology Policy, Executive Office of the President, 1987.
- 233
- The federal high performance computing program. Office of Science and
Technology Policy, Executive Office of the President, 1989.
- 234
- M. Quinn. Analysis and implementation of branch-and-bound algorithms on a
hypercube multicomputer. IEEE Trans. Computs., C-39(3):384--387,
1990.
- 235
- M. Quinn. Parallel Computing: Theory and Practice. McGraw-Hill,
1994.
- 236
- M. Quinn and N. Deo. Parallel graph algorithms. Computing
Surveys, 16(3):319--348, 1984.
- 237
- M. Quinn and N. Deo. An upper bound for the speedup of parallel best-bound
branch-and-bound algorithms. BIT, 26(1):35--43, 1986.
- 238
- S. Ranka and S. Sahni. Hypercube Algorithms for Image Processing and
Pattern Recognition. Springer-Verlag, 1990.
- 239
- V. Rao and V. Kumar. Parallel depth-first search, part I: Implementation.
Intl. J. of Parallel Programming, 16(6):501--519, 1987.
- 240
- D. A. Reed. Experimental performance analysis of parallel systems:
Techniques and open problems. In Proc. 7th Intl Conf. on Modeling
Techniques and Tools for Computer Performance Evaluation, 1994.
- 241
- D. A. Reed, R. A. Aydt, R. J. Noe, P. C. Roth, K. A. Shields, B. W.
Schwartz, and L. F. Tavera. Scalable performance analysis: The Pablo
performance analysis environment. In Proc. Scalable Parallel Libraries
Conf., pages 104--113. IEEE Computer Society, 1993.
- 242
- D. A. Reed and R. M. Fujimoto. Multicomputer Networks: Message-Based
Parallel Processing. MIT Press, 1989.
- 243
- A. Reinefeld and V. Schnecke. Work-load balancing in highly parallel
depth-first search. In Proc. 1994 Scalable High-Performance Computing
Conf., pages 773--780. IEEE Computer Society, 1994.
- 244
- B. Ries, R. Anderson, W. Auld, D. Breazeal, K. Callaghan, E. Richards, and
W. Smith. The Paragon performance monitoring environment. In Proc.
Supercomputing '93, pages 850--859. IEEE Computer Society, 1993.
- 245
- A. Rogers and K. Pingali. Process decomposition through locality of
reference. In Proc. SIGPLAN '89 Conf. on Programming Language Design and
Implementation. ACM, 1989.
- 246
- K. Rokusawa, N. Ichiyoshi, T. Chikayama, and H. Nakashima. An efficient
termination detection and abortion algorithm for distributed processing
systems. In Proc. 1988 Intl Conf. on Parallel Processing: Vol. I,
pages 18--22, 1988.
- 247
- M. Rosing, R. B. Schnabel, and R. P. Weaver. The DINO parallel programming
language. Technical Report CU-CS-501-90, Computer Science Department,
University of Colorado at Boulder, Boulder, Col., 1990.
- 248
- Y. Saad and M. H. Schultz. Topological properties of hypercubes. IEEE
Trans. Computs., C-37:867--872, 1988.
- 249
- Y. Saad and M. H. Schultz. Data communication in hypercubes. J.
Parallel and Distributed Computing, 6:115--135, 1989.
- 250
- P. Sadayappan and F. Ercal. Nearest-neighbor mapping of finite element
graphs onto processor meshes. IEEE Trans. Computs.,
C-36(12):1408--1424, 1987.
- 251
- J. Saltz, H. Berryman, and J. Wu. Multiprocessors and runtime compilation.
Concurrency: Practice and Experience, 3(6):573--592, 1991.
- 252
- J. Schwartz. Ultracomputers. ACM Trans. Program. Lang. Syst.,
2(4):484--521, 1980.
- 253
- C. L. Seitz. Concurrent VLSI architectures. IEEE Trans. Computs.,
C-33(12):1247--1265, 1984.
- 254
- C. L. Seitz. The cosmic cube. Commun. ACM, 28(1):22--33, 1985.
- 255
- C. L. Seitz. Multicomputers. In C.A.R. Hoare, editor, Developments in
Concurrency and Communication. Addison-Wesley, 1991.
- 256
- M. S. Shephard and M. K. Georges. Automatic three-dimensional mesh
generation by the finite octree technique. Int. J. Num. Meth. Engng.,
32(4):709--749, 1991.
- 257
- J. Shoch, Y. Dalal, and D. Redell. Evolution of the Ethernet local
computer network. Computer, 15(8):10--27, 1982.
- 258
- H. Simon. Partitioning of unstructured problems for parallel processing.
Computing Systems in Engineering, 2(2/3):135--148, 1991.
- 259
- J. Singh, J. L. Hennessy, and A. Gupta. Scaling parallel programs for
multiprocessors: Methodology and examples. IEEE Computer,
26(7):42--50, 1993.
- 260
- M. Singhal. Deadlock detection in distributed systems. Computer,
22(11):37--48, 1989.
- 261
- P. Sivilotti and P. Carlin. A tutorial for CC++. Technical Report
CS-TR-94-02, Caltech, 1994.
- 262
- A. Skjellum. The Multicomputer Toolbox: Current and future directions. In
Proc. Scalable Parallel Libraries Conf., pages 94--103. IEEE Computer
Society, 1993.
- 263
- A. Skjellum, editor. Proc. 1993 Scalable Parallel Libraries Conf.
IEEE Computer Society, 1993.
- 264
- A. Skjellum, editor. Proc. 1994 Scalable Parallel Libraries Conf.
IEEE Computer Society, 1994.
- 265
- A. Skjellum, N. Doss, and P. Bangalore. Writing libraries in MPI. In
Proc. Scalable Parallel Libraries Conf., pages 166--173. IEEE
Computer Society, 1993.
- 266
- A. Skjellum, S. Smith, N. Doss, A. Leung, and M. Morari. The design and
evolution of Zipcode. Parallel Computing, 20:565--596, 1994.
- 267
- J. R. Smith. The Design and Analysis of Parallel Algorithms.
Oxford University Press, 1993.
- 268
- L. Snyder. Type architectures, shared memory, and the corollary of modest
potential. Ann. Rev. Comput. Sci., 1:289--317, 1986.
- 269
- H. S. Stone. High-Performance Computer Architectures.
Addison-Wesley, third edition, 1993.
- 270
- B. Stroustrup. The C++ Programming Language. Addison-Wesley,
second edition, 1991.
- 271
- C. Stunkel, D. Shea, D. Grice, P. Hochschild, and M. Tsao. The SP1
high-performance switch. In Proc. 1994 Scalable High-Performance Computing
Conf., pages 150--157. IEEE Computer Society, 1994.
- 272
- R. Suaya and G. Birtwistle, editors. VLSI and Parallel
Computation. Morgan Kaufmann, 1990.
- 273
- J. Subhlok, J. Stichnoth, D. O'Hallaron, and T. Gross. Exploiting task and
data parallelism on a multicomputer. In Proc. 4th ACM SIGPLAN Symp. on
Principles and Practice of Parallel Programming. ACM, 1993.
- 274
- X.-H. Sun and L. M. Ni. Scalable problems and memory-bounded speedup.
J. Parallel and Distributed Computing, 19(1):27--37, 1993.
- 275
- V. Sunderam. PVM: A framework for parallel distributed computing.
Concurrency: Practice and Experience, 2(4):315--339, 1990.
- 276
- Supercomputer Systems Division, Intel Corporation. Paragon XP/S
Product Overview, 1991.
- 277
- P. Swarztrauber. Multiprocessor FFTs. Parallel Computing,
5:197--210, 1987.
- 278
- D. Tabak. Advanced Multiprocessors. McGraw-Hill, 1991.
- 279
- A. Tantawi and D. Towsley. Optimal load balancing in distributed computer
systems. J. ACM, 32(2):445--465, 1985.
- 280
- R. Taylor and P. Wilson. Process-oriented language meets demands of
distributed processing. Electronics, Nov. 30, 1982.
- 281
- Thinking Machines Corporation. The CM-2 Technical Summary, 1990.
- 282
- Thinking Machines Corporation. CM Fortran Reference Manual, version
2.1, 1993.
- 283
- Thinking Machines Corporation. CMSSL for CM Fortran Reference Manual,
version 3.0, 1993.
- 284
- A. Thomasian and P. F. Bay. Analytic queuing network models for parallel
processing of task systems. IEEE Trans. Computs.,
C-35(12):1045--1054, 1986.
- 285
- E. Tufte. The Visual Display of Quantitative Information.
Graphics Press, 1983.
- 286
- J. Ullman. Computational Aspects of VLSI. Computer Science Press,
1984.
- 287
- Building an advanced climate model: Program plan for the CHAMMP climate
modeling program. U.S. Department of Energy, 1990. Available from National
Technical Information Service, U.S. Dept of Commerce, 5285 Port Royal Rd,
Springfield, VA 22161.
- 288
- L. Valiant. A bridging model for parallel computation. Commun.
ACM, 33(8):103--111, 1990.
- 289
- R. A. van de Geijn. Efficient global combine operations. In Proc. 6th
Distributed Memory Computing Conf., pages 291--294. IEEE Computer
Society, 1991.
- 290
- E. F. van de Velde. Concurrent Scientific Computing. Number 16 in
Texts in Applied Mathematics. Springer-Verlag, 1994.
- 291
- Y. Wallach. Parallel Processing and Ada. Prentice-Hall, 1991.
- 292
- W. Washington and C. Parkinson. An Introduction to Three-Dimensional
Climate Modeling. University Science Books, 1986.
- 293
- R. Williams. Performance of dynamic load balancing algorithms for
unstructured mesh calculations. Concurrency: Practice and Experience,
3(5):457--481, 1991.
- 294
- S. Wimer, I. Koren, and I. Cederbaum. Optimal aspect ratios of building
blocks in VLSI. In Proc. 25th ACM/IEEE Design Automation Conf., pages
66--72, 1988.
- 295
- N. Wirth. Program development by stepwise refinement. Commun.
ACM, 14(4):221--227, 1971.
- 296
- M. Wolfe. Optimizing Supercompilers for Supercomputers. MIT
Press, 1989.
- 297
- P. H. Worley. The effect of time constraints on scaled speedup. SIAM
J. Sci. and Stat. Computing, 11(5):838--858, 1990.
- 298
- P. H. Worley. Limits on parallelism in the numerical solution of linear
PDEs. SIAM J. Sci. and Stat. Computing, 12(1):1--35, 1991.
- 299
- J. Worlton. Characteristics of high-performance computers. In
Supercomputers: Directions in Technology and its Applications, pages
21--50. National Academy Press, 1989.
- 300
- X3J3 Subcommittee. American National Standard Programming Language
Fortran (X3.9-1978). American National Standards Institute, 1978.
- 301
- J. Yan, P. Hontalas, S. Listgarten, et al. The Automated Instrumentation
and Monitoring System (AIMS) reference manual. NASA Technical Memorandum
108795, NASA Ames Research Center, Moffett Field, Calif., 1993.
- 302
- H. Zima, H.-J. Bast, and M. Gerndt. SUPERB: A tool for semi-automatic
MIMD/SIMD parallelization. Parallel Computing, 6:1--18, 1988.
- 303
- H. Zima and B. Chapman. Supercompilers for Parallel and Vector
Computers. Addison-Wesley, 1991.
© Copyright 1995 by Ian Foster