# REFERENCES

### Abbreviations

ACM Association for Computing Machinery

IEEE Institute of Electrical and Electronics Engineers

ABBO00 Abbot, D. PCI Bus Demystified. Eagle Rock, VA: LLH Technology Publishing, 2000.

ACOS86 Acosta, R.; Kjelstrup, J.; and Torng, H. "An Instruction Issuing Approach to Enhancing Performance in Multiple Functional Unit Processors." *IEEE Transactions on Computers*, September 1986.

ADAM91 Adamek, J. Foundations of Coding. New York: Wiley, 1991.

AGAR89 Agarwal, A. Analysis of Cache Performance for Operating Systems and Multiprogramming. Boston: Kluwer Academic Publishers, 1989.

AGER87 Agerwala, T., and Cocke, J. High Performance Reduced Instruction Set Processors. Technical Report RC12434 (#55845). Yorktown, NY: IBM Thomas J. Watson Research Center, January 1987.

ALEX93 Alexandridis, N. Design of Microprocessor-Based Systems. Englewood Cliffs, NJ: Prentice Hall, 1993.

ANDE67a Anderson, D.; Sparacio, F.; and Tomasulo, F. "The IBM System/360 Model 91: Machine Philosophy and Instruction Handling." *IBM Journal of Research and Development*, January 1967.

- ANDE67b Anderson, S., et al. "The IBM System/360 Model 91: Floating-Point Execution Unit." *IBM Journal of Research and Development*, January 1967. Reprinted in [SWAR90, Volume 1].
- ANDE98 Anderson, D. FireWire System Architecture. Reading, MA: Addison-Wesley, 1998.
- ATKI96 Atkins, M. "PC Software Performance Tuning." IEEE Computer, August 1996.

AZIM92 Azimi, M.; Prasad, B.; and Bhat, K. "Two Level Cache Architectures." Proceedings COMPCON '92, February 1992.



BAEN97 Baentsch, M., et al. "Enhancing the Web's Infrastructure: From Caching to Replication." Internet Computing, March/April 1997.

- **BAIL93** Bailey, D. "RISC Microprocessors and Scientific Computing." Proceedings, Supercomputing '93, 1993.
- **BASH81** Bashe, C.; Bucholtz, W.; Hawkins, G.; Ingram, J.; and Rochester, N. "The Architecture of IBM's Early Computers." IBM Journal of Research and Development, September 1981.
- **BASH91** Bashteen, A.; Lui, I.; and Mullan, J. "A Superpipeline Approach to the MIPS Architecture." Proceedings, COMPCON Spring '91, February 1991.
- **BELL70** Bell, C.; Cady, R.; McFarland, H.; Delagi, B.; O'Loughlin, J.; and Noonan, R. "A New Architecture for Minicomputers-The DEC PDP-11." Proceedings, Spring Joint Computer Conference, 1970.

**BELL71a** Bell, C., and Newell, A. Computer Structures: Readings and Examples. New York: McGraw-Hill, 1971.

BELL78a Bell, C.; Mudge, J.; and McNamara, J. Computer Engineering: A DEC View of Hardware Systems Design. Bedford, MA: Digital Press, 1978.

BELL78b Bell, C.; Newell, A.; and Siewiorek, D. "Structural Levels of the PDP-8." In [BELL78a].

BELL78c Bell, C.; Kotok, A.; Hastings, T.; and Hill, R. "The Evolution of the DEC System-10." Communications of the ACM, January 1978.

BENH92 Benham, J. "A Geometric Approach to Presenting Computer Representations of Integers." SIGCSE Bulletin, December 1992.

BETK97 Betker, M.; Fernando, J.; and Whalen, S. "The History of the Microprocessor." Bell Labs Technical Journal, Autumn 1997.

**BHAR00** Bharadwaj, J., et al. "The Intel IA-64 Compiler Code Generator." IEEE Micro, September/October 2000.

BLAA97 Blaauw, G., and Brooks, F. Computer Architecture: Concepts and Evolution. Reading, MA: Addison-Wesley, 1997.

BLAH83 Blahut, R. Theory and Practice of Error Control Codes. Reading, MA: Addison-Wesley, 1983.

BOHR98 Bohr, M. "Silicon Trends and Limits for Advanced Microprocessors." Communications of the ACM, March 1998.

BRAD91a Bradlee, D.; Eggers, S.; and Henry, R. "The Effect on RISC Performance of Register Set Size and Structure Versus Code Generation Strategy." Proceedings, 18th Annual International Symposium on Computer Architecture, May 1991.

BRAD91b Bradlee, D.; Eggers, S.; and Henry, R. "Integrating Register Allocation and Instruction Scheduling for RISCs." Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991.

**BREW97** Brewer, E. "Clustering: Multiply and Conquer." *Data Communications*, July 1997.

**BREY00** Brey, B. The Intel Microprocessors: 8086/8088, 80186/80188, 80286, 80386, 80486, Pentium, Pentium Pro and Pentium II Processors. Upper Saddle River, NJ: Prentice Hall, 2000.

**BURG97** Burger, D., and Austin, T. "The SimpleScalar Tool Set, Version 2.0." Computer Architecture News, June 1997.

**BURK46** Burks, A.; Goldstine, H.; and von Neumann, J. Preliminary Discussion of the Logical Design of an Electronic Computer Instrument. Report prepared for U.S. Army Ordnance Dept., 1946, reprinted in [BELL71a].

BUYY99a Buyya, R. High Performance Cluster Computing: Architectures and Systems. Upper Saddle River, NJ: Prentice Hall, 1999.

BUYY99b Buyya, R. High Performance Cluster Computing: Programming and Applications. Upper Saddle River, NJ: Prentice Hall, 1999.

- CART96 Carter, J. Microprocessor Architecture and Microprogramming. Upper Saddle River, NJ: Prentice Hall, 1996.
- CATA94 Catanzaro, B. Multiprocessor System Architectures. Mountain View, CA: Sunsoft Press, 1994.
- CHAI82 Chaitin, G. "Register Allocation and Spilling via Graph Coloring." Proceedings, SIGPLAN Symposium on Compiler Construction, June 1982.
- CARM00 Carmean, D. "Inside the High-Performance Intel Pentium 4 Processor Microarchitecture." *Intel Developer Forum*, Fall 2000. ftp://download.intel.com/ design/id /all2000/presentations/pda/pda\_s01\_cd.pdf.
- CHAS00 Chasin, A. "Predication, Speculation, and Modern CPUs." Dr. Dobb's Journal, May 2000.
- CHEN94 Chen, P.; Lee, E.; Gibson, G.; Katz, R.; and Patterson, D. "RAID: High-Performance, Reliable Secondary Storage." ACM Computing Surveys, June 1994.

- CHOW86 Chow, F.; Himmelstein, M.; Killian, E.; and Weber, L. "Engineering a RISC Compiler System." *Proceedings, COMPCON Spring '86*, March 1986.
- CHOW87 Chow, F.; Correll, S.; Himmelstein, M.; Killian, E.; and Weber, L. "How Many Addressing Modes Are Enough?" Proceedings, Second International Conference on Architectural Support for Programming Languages and Operating Systems, October 1987.
- CHOW90 Chow, F., and Hennessy, J. "The Priority-Based Coloring Approach to Register Allocation." ACM Transactions on Programming Languages, October 1990.
- CLAR85 Clark, D., and Emer, J. "Performance of the VAX-11/780 Translation Buffer: Simulation and Measurement." ACM Transactions on Computer Systems, February 1985.

CLEM00 Clements, A. "The Undergraduate Curriculum in Computer Architecture." IEEE Micro, May/June 2000.

**COHE81** Cohen, D. "On Holy Wars and a Plea for Peace." Computer, October 1981.

COLW85a Colwell, R.; Hitchcock, C.; Jensen, E.; Brinkley-Sprunt, H.; and Kollar, C. "Computers, Complexity, and Controversy." Computer, September 1985.

COLW85b Colwell, R.; Hitchcock, C.; Jensen, E.; and Sprunt, H. "More Controversy About 'Computers, Complexity, and Controversy.'" Computer, December 1985.

**COME95** Comerford, R. "An Overview of High Performance." *IEEE Spectrum*, April 1995.

**COME00** Comerford, R. "Magnetic Storage: The Medium that Wouldn't Die." *IEEE Spectrum*, December 2000.

**COOK82** Cook, R., and Dande, N. "An Experiment to Improve Operand Addressing." *Proceedings, Symposium on Architecture Support for Programming Languages and Operating Systems*, March 1982.

COON81 Coonen, J. "Underflow and Denormalized Numbers." *IEEE Computer*, March 1981.

COUT86 Coutant, D.; Hammond, C.; and Kelley, J. "Compilers for the New Generation of Hewlett-Packard Computers." Proceedings, COMPCON Spring '86, March 1986.

**CRAG79** Cragon, H. "An Evaluation of Code Space Requirements and Performance of Various Architectures." *Computer Architecture News*, February 1979.

CRAG92 Cragon, H. Branch Strategy Taxonomy and Performance Models. Los Alamitos, CA: IEEE Computer Society Press, 1992.

CRAW90 Crawford, J. "The i486 CPU: Executing Instructions in One Clock Cycle." IEEE Micro, February 1990.

CRIS97 Crisp, R. "Direct RAMBUS Technology: The New Main Memory Standard." IEEE Micro, November/December 1997.

**DATT93** Dattatreya, G. "A Systematic Approach to Teaching Binary Arithmetic in a First Course." *IEEE Transactions on Education*, February 1993.

**DAVI87** Davidson, J., and Vaughan, R. "The Effect of Instruction Set Complexity on Program Size and Memory Performance." *Proceedings, Second International Conference on Architectural Support for Programming Languages and Operating Systems,* October 1987.

**DENN68** Denning, P. "The Working Set Model for Program Behavior." Communications of the ACM, May 1968.

**DEWA90** Dewar, R., and Smosna, M. Microprocessors: A Programmer's View. New York: McGraw-Hill, 1990.

**DIJK63** Dijkstra, E. "Making an ALGOL Translator for the X1." In Annual Review of Automatic Programming, Volume 4. Pergamon, 1963.

**DOET97** Doetting, G., et al. "S/390 Parallel Enterprise Server Generation 3: A Balanced System and Cache Structure." *IBM Journal of Research and Development*, July/September 1997.

**DOWD98** Dowd, K., and Severance, C. *High Performance Computing*. Sebastopol, CA: O'Reilly, 1998.

**DUBE91** Dubey, P., and Flynn, M. "Branch Strategies: Modeling and Optimization." *IEEE Transactions on Computers*, October 1991.

DULO98 Dulong, C. "The IA-64 Architecture at Work." Computer, July 1998.

ECKE90 Eckert, R. "Communication Between Computers and Peripheral Devices— An Analogy." ACM SIGCSE Bulletin, September 1990.

- ELAY85 El-Ayat, K., and Agarwal, R. "The Intel 80386—Architecture and Implementation." *IEEE Micro*, December 1985.
- **EVEN00** Even, G., and Paul, W. "On the Design of IEEE Compliant Floating-Point Units." *IEEE Transactions on Computers*, May 2000.
- **EVER98** Evers, M., et al. "An Analysis of Correlation and Predictability: What Makes Two-Level Branch Predictors Work." *Proceedings, 25th Annual International Symposium on Microarchitecture*, July 1998.
- **EVER01** Evers, M., and Yeh, T. "Understanding Branches and Designing Branch Predictors for High-Performance Microprocessors." *Proceedings of the IEEE*, November 2001.
- **FARM92** Farmwald, M., and Mooring, D. "A Fast Path to One Memory." *IEEE Spectrum*, October 1992.
- FITZ81 Fitzpatrick, D., et al. "A RISCy Approach to VLSI." VLSI Design, 4th quarter, 1981. Reprinted in Computer Architecture News, March 1982.

FLYN71 Flynn, M., and Rosin, R. "Microprogramming: An Introduction and a Viewpoint." *IEEE Transactions on Computers*, July 1971.

FLYN72 Flynn, M. "Some Computer Organizations and Their Effectiveness." IEEE Transactions on Computers, September 1972.

FLYN85 Flynn, M.; Johnson, J.; and Wakefield, S. "On Instruction Sets and Their Formats." *IEEE Transactions on Computers*, March 1985.

**FLYN87** Flynn, M.; Mitchell, C.; and Mulder, J. "And Now a Case for More Complex Instruction Sets." *Computer*, September 1987.

**FLYN01** Flynn, M., and Oberman, S. Advanced Computer Arithmetic Design. New York: Wiley, 2001.

**FRAI83** Frailey, D. "Word Length of a Computer Architecture: Definitions and Applications." *Computer Architecture News*, June 1983.

**FRIE96** Friedman, M. "RAID Keeps Going and Going and ..." *IEEE Spectrum*, April 1996.

FURH87 Furht, B., and Milutinovic, V. "A Survey of Microprocessor Architectures for Memory Management." Computer, March 1987.

FUTR01 Futral, W. InfiniBand Architecture: Development and Deployment. Hillsboro, OR: Intel Press, 2001.

GIFF87 Gifford, D., and Spector, A. "Case Study: IBM's System/360-370 Architecture." Communications of the ACM, April 1987.

**GOLD91** Goldberg, D. "What Every Computer Scientist Should Know About Floating-Point Arithmetic." ACM Computing Surveys, March 1991. Available at http://www.validgh.com/

HAND98 Handy, J. The Cache Memory Book. San Diego: Academic Press, 1993.

HALF97 Halfhill, T. "Beyond Pentium II." Byte, December 1997.

HAYE98 Hayes, J. Computer Architecture and Organization. New York: McGraw-Hill, 1998.

HEAT84 Heath, J. "Re-evaluation of RISC I." Computer Architecture News, March 1984.

HENN82 Hennessy, J., et al. "Hardware/Software Tradeoffs for Increased Performance." Proceedings, Symposium on Architectural Support for Programming Languages and Operating Systems, March 1982.

HENN84 Hennessy, J. "VLSI Processor Architecture." IEEE Transactions on Computers, December 1984.

HENN91 Hennessy, J., and Jouppi, N. "Computer Technology and Architecture: An Evolving Interaction." *Computer*, September 1991.

HENN96 Hennessy, J., and Patterson, D. Computer Architecture: A Quantitative Approach. San Mateo, CA: Morgan Kaufmann, 1996.

HIDA90 Hidaka, H.; Matsuda, Y.; Asakura, M.; and Kazuyasu, F. "The Cache DRAM Architecture: A DRAM with an On-Chip Cache Memory." *IEEE Micro*, April 1990.

HIGB90 Higbie, L. "Quick and Easy Cache Performance Analysis." Computer Architecture News, June 1990.

HILL64 Hill, R. "Stored Logic Programming and Applications." Datamation, February 1964.

HILL89 Hill, M. "Evaluating Associativity in CPU Caches." IEEE Transactions on Computers, December 1989.

HINT01 Hinton, G., et al. "The Microarchitecture of the Pentium 4 Processor." Intel Technology Journal, Q1 2001. http://developer.intel.com/technology/itj/

HUCK83 Huck, T. Comparative Analysis of Computer Architectures. Stanford University Technical Report No. 83-243, May 1983.

HUCK00 Huck, J., et al. "Introducing the IA-64 Architecture." IEEE Micro, September/October 2000.

- HUGU91 Huguet, M., and Lang, T. "Architectural Support for Reduced Register Saving/Restoring in Single-Window Register Files." ACM Transactions on Computer Systems, February 1991.
- HUTC96 Hutcheson, G., and Hutcheson, J. "Technology and Economics in the Semiconductor Industry." Scientific American, January 1996.
- HWAN93 Hwang, K. Advanced Computer Architecture. New York: McGraw-Hill, 1993.
- HWAN99 Hwang, K., et al. "Designing SSI Clusters with Hierarchical Checkpointing and Single I/O Space." IEEE Concurrency, January-March 1999.

- HWU98 Hwu, W. "Introduction to Predicated Execution." Computer, January 1998.
- HWU01 Hwu, W.; August, D.; and Sias, J. "Program Decision Logic Optimization Using Predication and Control Speculation." Proceedings of the IEEE, November 2001.
- **IBM94** International Business Machines, Inc. The PowerPC Architecture: A Specification for a New Family of RISC Processors. San Francisco, CA: Morgan Kaufmann, 1994.
- IBM01 International Business Machines, Inc. 64 Mb Synchronous DRAM. IBM Data Sheet 364164, January 2001.
- IEEE85 Institute of Electrical and Electronics Engineers. IEEE Standard for Binary Floating-Point Arithmetic. ANSI/IEEE Std 754-1985, 1985.
- INTE98 Intel Corp. Pentium Pro and Pentium II Processors and Related Products. Aurora, CO, 1998.
- **INTE00a** Intel Corp. Intel IA-64 Architecture Software Developer's Manual (4 volumes). Document 245317 through 245320. Aurora, CO, 2000.

- **INTE00b** Intel Corp. Itanium Processor Microarchitecture Reference for Software Optimization. Aurora, CO, Document 245473. August 2000.
- **INTE01a** Intel Corp. IA-32 Intel Architecture Software Developer's Manual (2 volumes). Document 245470 and 245471. Aurora, CO, 2001.
- INTE01b Intel Corp. Intel Pentium 4 Processor Optimization Reference Manual. Document 248966-04. Aurora, CO, 2001. http://developer.intel.com/design/pentium4/manuals/248966.htm.
- JAME90 James, D. "Multiplexed Buses: The Endian Wars Continue." IEEE Micro, September 1983.
- JARP01 Jarp, S. "Optimizing IA-64 Performance." Dr. Dobb's Journal, July 2001.
- JOHN91 Johnson, M. Superscalar Microprocessor Design. Englewood Cliffs, NJ: Prentice Hall, 1991.

JOUP88 Jouppi, N. "Superscalar versus Superpipelined Machines." Computer Architecture News, June 1988.

JOUP89a Jouppi, N., and Wall, D. "Available Instruction-Level Parallelism for Superscalar and Superpipelined Machines." Proceedings, Third International Conference on Architectural Support for Programming Languages and Operating Systems, April 1989.

**JOUP89b** Jouppi, N. "The Nonuniform Distribution of Instruction-Level and Machine Parallelism and Its Effect on Performance." *IEEE Transactions on Computers*, December 1989.

JTF01 Joint Task Force on Computing Curricula. Computing Curricula 2001 Computer Science. IEEE Computer Society and ACM, August 2001.

KAEL91 Kaeli, D., and Emma, P. "Branch History Table Prediction of Moving Target Branches Due to Subroutine Returns." Proceedings, 18th Annual International Symposium on Computer Architecture, May 1991.

KAGA01 Kagan, M. "InfiniBand: Thinking Outside the Box Design." Communications System Design, September 2001. (www.csdmag.com)

- **KANE92** Kane, G., and Heinrich, J. *MIPS RISC Architecture*. Englewood Cliffs, NJ: Prentice Hall, 1992.
- KAPP00 Kapp, C. "Managing Cluster Computers." Dr. Dobb's Journal, July 2000.
- KATE83 Katevenis, M. Reduced Instruction Set Computer Architectures for VLSI. Ph.D. dissertation, Computer Science Department, University of California at Berkeley, October 1983. Reprinted by MIT Press, Cambridge, MA, 1985.
- **KATH01** Kathail, B.; Schlansker, M.; and Rau, B. "Compiling for EPIC Architectures." *Proceedings of the IEEE*, November 2001.
- KATZ89 Katz, R.; Gibson, G.; and Patterson, D. "Disk System Architecture for High Performance Computing." *Proceedings of the IEEE*, December 1989.
- KEET01 Keeth, B., and Baker, R. DRAM Circuit Design: A Tutorial. Piscataway, NJ: IEEE Press, 2001.

KHUR01 Khurshudov, A. The Essential Guide to Computer Data Storage. Upper Saddle River, NJ: Prentice Hall, 2001.

**KNUT71** Knuth, D. "An Empirical Study of FORTRAN Programs." Software Practice and Experience, vol. 1, 1971.

KNUT98 Knuth, D. The Art of Computer Programming, Volume 2: Seminumerical Algorithms. Reading, MA: Addison-Wesley, 1998.

**KUCK72** Kuck, D.; Muraoka, Y.; and Chen, S. "On the Number of Operations Simultaneously Executable in Fortran-like Programs and Their Resulting Speedup." *IEEE Transactions on Computers*, December 1972.

KUGA91 Kuga, M.; Murakami, K.; and Tomita, S. "DSNS (Dynamically-hazard resolved, Statically-code-scheduled, Nonuniform Superscalar): Yet Another Superscalar Processor Architecture." Computer Architecture News, June 1991.

- **LEE91** Lee, R.; Kwok, A.; and Briggs, F. "The Floating Point Performance of a Superscalar SPARC Processor." *Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems,* April 1991.
- LILJ88 Lilja, D. "Reducing the Branch Penalty in Pipelined Processors." Computer, July 1988.
- LILJ93 Lilja, D. "Cache Coherence in Large-Scale Shared-Memory Multiprocessors: Issues and Comparisons." ACM Computing Surveys, September 1993.
- LOVE96 Lovett, T., and Clapp, R. "Implementation and Performance of a CC-NUMA System." *Proceedings, 23rd Annual International Symposium on Computer Architecture*, May 1996.
- LUND77 Lunde, A. "Empirical Evaluation of Some Features of Instruction Set Processor Architectures." Communications of the ACM, March 1977.

- LYNC93 Lynch, M. Microprogrammed State Machine Design. Boca Raton, FL: CRC Press, 1993.
- MACG84 MacGregor, D.; Mothersole, D.; and Moyer, B. "The Motorola MC68020." IEEE Micro, August 1984.
- MAHL94 Mahlke, S., et al. "Characterizing the Impact of Predicated Execution on Branch Prediction." *Proceedings, 27th International Symposium on Microarchitecture*, December 1994.
- MAHL95 Mahlke, S., et al. "A Comparison of Full and Partial Predicated Execution Support for ILP Processors." *Proceedings, 22nd International Symposium on Computer Architecture*, June 1995.
- MAK97 Mak, P., et al. "Shared-Cache Clusters in a System with a Fully Shared Memory." *IBM Journal of Research and Development*, July/September 1997.

MALL75 Mallach, E. "Emulation Architecture." Computer, August 1975.

MALL83 Mallach, E., and Sondak, N. Advances in Microprogramming. Dedham, MA: Artech House, 1983.

- MANJ01a Manjikian, N. "More Enhancements of the SimpleScalar Tool Set." Computer Architecture News, September 2001.
- MANJ01b Manjikian, N. "Multiprocessor Enhancements of the SimpleScalar Tool Set." Computer Architecture News, March 2001.
- MANO01 Mano, M. Logic and Computer Design Fundamentals. Upper Saddle River, NJ: Prentice Hall, 2001.
- MARC90 Marchant, A. Optical Recording. Reading, MA: Addison-Wesley, 1990.
- MARK00 Markstein, P. IA-64 and Elementary Functions. Upper Saddle River, NJ: Prentice Hall PTR, 2000.
- MASH95 Mashey, J. "CISC vs. RISC (or what is RISC really)." USENET comp.arch newsgroup, article 46782, February 1995.

MASS97 Massiglia, P. The RAID Book: A Storage System Technology Handbook. St. Peter, MN: The Raid Advisory Board, 1997.

MAYB84 Mayberry, W., and Efland, G. "Cache Boosts Multiprocessor Performance." Computer Design, November 1984.

MCEL85 McEliece, R. "The Reliability of Computer Memories." Scientific American, January 1985.

MEE96a Mee, C., and Daniel, E. eds. Magnetic Recording Technology. New York: McGraw-Hill, 1996.

MEE96b Mee, C., and Daniel, E. eds. Magnetic Storage Handbook. New York: McGraw-Hill, 1996.

MILE00 Milenkovic, A. "Achieving High Performance in Bus-Based Shared-Memory Multiprocessors." *IEEE Concurrency*, July-September 2000.

MIRA92 Mirapuri, S.; Woodacre, M.; and Vasseghi, N. "The MIPS R4000 Processor." *IEEE Micro*, April 1992.

MOOR65 Moore, G. "Cramming More Components Onto Integrated Circuits." Electronics Magazine, April 19, 1965.

MORS78 Morse, S.; Pohlman, W.; and Ravenel, B. "The Intel 8086 Microprocessor: A 16-bit Evolution of the 8080." *Computer*, June 1978.

MOSH01 Moshovos, A., and Sohi, G. "Microarchitectural Innovations: Boosting Microprocessor Performance Beyond Semiconductor Technology Scaling." *Proceedings of the IEEE*, November 2001.

MOTO01 Motorola, Inc. PowerPC MPC7410 RISC Microprocessor Hardware Specifications. Denver, CO: 2001. www.motorola.com

MYER78 Myers, G. "The Evaluation of Expressions in a Storage-to-Storage Architecture." Computer Architecture News, June 1978.

NAYF96 Nayfeh, B.; Olukotun, K.; and Singh, J. "The Impact of Shared Cache Clustering in Small-Scale Shared-Memory Multiprocessors." Proceedings of the Second International Symposium on High Performance Computer Architecture, 1996.

NOVI93 Novitsky, J.; Azimi, M.; and Ghaznavi, R. "Optimizing Systems Performance Based on Pentium Processors." *Proceedings COMPCON* '92, February 1993.

**OBER97a** Oberman, S., and Flynn, M. "Design Issues in Division and Other Floating-Point Operations." *IEEE Transactions on Computers*, February 1997.

**OBER97b** Oberman, S., and Flynn, M. "Division Algorithms and Implementations." *IEEE Transactions on Computers*, August 1997.

**OVER01** Overton, M. Numerical Computing with IEEE Floating Point Arithmetic. Philadelphia, PA: Society for Industrial and Applied Mathematics, 2001.

PADE81 Padegs, A. "System/360 and Beyond." IBM Journal of Research and Development, September 1981.

PADE88 Padegs, A.; Moore, B.; Smith, R.; and Buchholz, W. "The IBM System/370 Vector Architecture: Design Considerations." *IEEE Transactions on Communications*, May 1988.

- **PARH00** Parhami, B. Computer Arithmetic: Algorithms and Hardware Design. Oxford: Oxford University Press, 2000.
- **PARK89** Parker, A., and Hamblen, J. An Introduction to Microprogramming with Exercises Designed for the Texas Instruments SN74ACT8800 Software Development Board. Dallas, TX: Texas Instruments, 1989.
- **PATT82a** Patterson, D., and Sequin, C. "A VLSI RISC." Computer, September 1982.
- **PATT82b** Patterson, D., and Piepho, R. "Assessing RISCs in High-Level Language Support." *IEEE Micro*, November 1982.

PATT84 Patterson, D. "RISC Watch." Computer Architecture News, March 1984.

PATT85a Patterson, D. "Reduced Instruction Set Computers." Communications of the ACM, January 1985.

- **PATT85b** Patterson, D., and Hennessy, J. "Response to 'Computers, Complexity, and Controversy.'" *Computer*, November 1985.
- PATT88 Patterson, D.; Gibson, G.; and Katz, R. "A Case for Redundant Arrays of Inexpensive Disks (RAID)." Proceedings, ACM SIGMOD Conference of Management of Data, June 1988.
- **PATT98** Patterson, D., and Hennessy, J. Computer Organization and Design: The Hardware/Software Interface. San Mateo, CA: Morgan Kaufmann, 1998.
- **PATT01** Patt, Y. "Requirements, Bottlenecks, and Good Fortune: Agents for Microprocessor Evolution." *Proceedings of the IEEE*, November 2001.
- **PEIR99** Peir, J.; Hsu, W.; and Smith, A. "Functional Implementation Techniques for CPU Cache Memories." *IEEE Transactions on Computers*, February 1999.
- **PELE97** Peleg, A.; Wilkie, S.; and Weiser, U. "Intel MMX for Multimedia PCs." Communications of the ACM, January 1997.
- **PFIS98** Pfister, G. In Search of Clusters. Upper Saddle River, NJ: Prentice Hall, 1998.
- POPE91 Popescu, V., et al. "The Metaflow Architecture." IEEE Micro, June 1991.
- **POTT94** Potter, T., et al. "Resolution of Data and Control-Flow Dependencies in the PowerPC 601." *IEEE Micro*, October 1994.
- **PRES01** Pressel, D. "Fundamental Limitations on the Use of Prefetching and Stream Buffers for Scientific Applications." *Proceedings, ACM Symposium on Applied Computing*, March 2001.
- PRIN91 Prince, B. Semiconductor Memories. New York: Wiley, 1991.
- **PRIN99** Prince, B. High Performance Memories: New Architecture DRAMs and SRAMs, Evolution and Function. New York: Wiley, 1999.

PRZY88 Przybylski, S.; Horowitz, M.; and Hennessy, J. "Performance Trade-offs in Cache Design." Proceedings, Fifteenth Annual International Symposium on Computer Architecture, June 1988.

**PRZY90** Przybylski, S. "The Performance Impact of Block Size and Fetch Strategies." Proceedings, 17th Annual International Symposium on Computer Architecture, May 1990.

RADI83 Radin, G. "The 801 Minicomputer." IBM Journal of Research and Development, May 1983.

**RAGA83** Ragan-Kelley, R., and Clark, R. "Applying RISC Theory to a Large Computer." Computer Design, November 1983.

RAUS80 Rauscher, T., and Adams, P. "Microprogramming: A Tutorial and Survey of Recent Developments." IEEE Transactions on Computers, January 1980.

**RECH98** Reches, S., and Weiss, S. "Implementation and Analysis of Path History in Dynamic Branch Prediction Schemes." IEEE Transactions on Computers, August 1998.

RODR01 Rodriguez, M.; Perez, J.; and Pulido, J. "An Educational Tool for Testing Caches on Symmetric Multiprocessors." Microprocessors and Microsystems, June 2001.

**ROSC99** Rosch, W. Winn L. Rosch Hardware Bible. Indianapolis, IN: Sams, 1999.

SATY81 Satyanarayanan, M., and Bhandarkar, D. "Design Trade-Offs in VAX-11 Translation Buffer Organization." Computer, December 1981.

SCHA97 Schaller, R. "Moore's Law: Past, Present, and Future." IEEE Spectrum, June 1997.

SCHL00a Schlansker, M., and Rau, B. "EPIC: Explicitly Parallel Instruction Computing." Computer, February 2000.

SCHL00b Schlansker, M., and Rau, B. EPIC: An Architecture for Instruction-Level Parallel Processors. HPL Technical Report HPL-1999-111, Hewlett-Packard Laboratories (www.hpl.hp.com), February 2000.

SCHW99 Schwarz, E., and Krygowski, C. "The S/390 G5 Floating-Point Unit." IBM Journal of Research and Development, September/November 1999.

SEBE76 Sebern, M. "A Minicomputer-compatible Microcomputer System: The DEC LSI-11." Proceedings of the IEEE, June 1976.

SEGE91 Segee, B., and Field, J. Microprogramming and Computer Architecture. New York: Wiley, 1991.

SERL86 Serlin, O. "MIPS, Dhrystones, and Other Tales." Datamation, June 1, 1986.

SHAN38 Shannon, C. "Symbolic Analysis of Relay and Switching Circuits." AIEE Transactions, vol. 57, 1938.

SHAN95a Shanley, T., and Anderson, D. PCI Systems Architecture. Richardson, TX: Mindshare Press, 1995.

- SHAN95b Shanley, T. PowerPC System Architecture. Reading, MA: Addison-Wesley, 1995.
- SHAN98 Shanley, T. Pentium Pro and Pentium II System Architecture. Reading, MA: Addison-Wesley, 1998.
- SHAR97 Sharma, A. Semiconductor Memories: Technology, Testing, and Reliability. New York: IEEE Press, 1997.
- SHAR00 Sharangpani, H., and Arora, K. "Itanium Processor Microarchitecture." *IEEE Micro*, September/October 2000.
- SHER84 Sherburne, R. Processor Design Tradeoffs in VLSI. PhD thesis, Report No. UCB/CSD 84/173, University of California at Berkeley, April 1984.

- SIEW82 Siewiorek, D.; Bell, C.; and Newell, A. Computer Structures: Principles and Examples. New York: McGraw-Hill, 1982.
- SIMA97 Sima, D. "Superscalar Instruction Issue." *IEEE Micro*, September/October 1997.
- SIMO69 Simon, H. The Sciences of the Artificial. Cambridge, MA: MIT Press, 1969.
- SMIT82 Smith, A. "Cache Memories." ACM Computing Surveys, September 1982.
- SMIT87 Smith, A. "Line (Block) Size Choice for CPU Cache Memories." IEEE Transactions on Communications, September 1987.
- **SMIT89** Smith, M.; Johnson, M.; and Horowitz, M. "Limits on Multiple Instruction Issue." *Proceedings, Third International Conference on Architectural Support for Programming Languages and Operating Systems*, April 1989.
- SMIT95 Smith, J., and Sohi, G. "The Microarchitecture of Superscalar Processors." *Proceedings of the IEEE*, December 1995.
- SODE96 Soderquist, P., and Leeser, M. "Area and Performance Tradeoffs in Floating-Point Divide and Square-Root Implementations." ACM Computing Surveys, September 1996.

- **SOHI90** Sohi, G. "Instruction Issue Logic for High-Performance Interruptable, Multiple Functional Unit, Pipelined Computers." *IEEE Transactions on Computers*, March 1990.
- **STAL00** Stallings, W. Data and Computer Communications, 5th edition. Upper Saddle River, NJ: Prentice Hall, 1997.
- **STAL01** Stallings, W. Operating Systems, Internals and Design Principles, 4th edition. Upper Saddle River, NJ: Prentice Hall, 2001.
- STEN90 Stenstrom, P. "A Survey of Cache Coherence Schemes of Multiprocessors." Computer, June 1990.
- STEV64 Stevens, W. "The Structure of System/360, Part II: System Implementation." *IBM Systems Journal*, Vol. 3, No. 2, 1964. Reprinted in [SIEW82].

**STON93** Stone, H. High-Performance Computer Architecture. Reading, MA: Addison-Wesley, 1993.

STRE78 Strecker, W. "VAX-11/780: A Virtual Address Extension to the DEC PDP-11 Family." Proceedings, National Computer Conference, 1978.

STRE83 Strecker, W. "Transient Behavior of Cache Memories." ACM Transactions on Computer Systems, November 1983.

STRI79 Stritter, E., and Gunter, T. "A Microprocessor Architecture for a Changing World: The Motorola 68000." Computer, February 1979.

SWAR90 Swartzlander, E., editor. Computer Arithmetic, Volumes I and II. Los Alamitos, CA: IEEE Computer Society Press, 1990.

TABA91 Tabak, D. Advanced Microprocessors. New York: McGraw-Hill, 1991.

TAMI83 Tamir, Y., and Sequin, C. "Strategies for Managing the Register File in RISC." IEEE Transactions on Computers, November 1983.

**TANE78** Tanenbaum, A. "Implications of Structured Programming for Machine Architecture." *Communications of the ACM*, March 1978.

**TANE99** Tanenbaum, A. Structured Computer Organization. Englewood Cliffs, NJ: Prentice Hall, 1999.

**THOM94** Thompson, T., and Ryan, B. "PowerPC 620 Soars." *Byte*, November 1994.

**THOM00** Thompson, D. "IEEE 1394: Changing the Way We Do Multimedia Communications." *IEEE Multimedia*, April–June 2000.

TI90 Texas Instruments Inc. SN74ACT8800 Family Data Manual. SCSS006C, 1990.

**TJAD70** Tjaden, G., and Flynn, M. "Detection and Parallel Execution of Independent Instructions." *IEEE Transactions on Computers*, October 1970.

TOMA93 Tomasevic, M., and Milutinovic, V. The Cache Coherence Problem in Shared-Memory Multiprocessors: Hardware Solutions. Los Alamitos, CA: IEEE Computer Society Press, 1993.

TOON81 Toong, H., and Gupta, A. "An Architectural Comparison of Contemporary 16-Bit Microprocessors." *IEEE Micro*, May 1981.

TRIE01 Triebel, W. Itanium Architecture for Software Developers. Intel Press, 2001.

**TUCK67** Tucker, S. "Microprogram Control for System/360." *IBM Systems Journal*, No. 4, 1967.

TUCK87 Tucker, S. "The IBM 3090 System Design with Emphasis on the Vector Facility." Proceedings, COMPCON Spring '87, February 1987.

VOEL88 Voelker, J. "The PDP-8." IEEE Spectrum, November 1988.

VOGL94 Vogley, B. "800 Megabyte Per Second Systems Via Use of Synchronous DRAM." Proceedings, COMPCON '94, March 1994.

VONN45 Von Neumann, J. First Draft of a Report on the EDVAC. Moore School, University of Pennsylvania, 1945. Reprinted in IEEE Annals on the History of Computing, No. 4, 1993.

- VRAN80 Vranesic, Z., and Thurber, K. "Teaching Computer Structures." Computer, June 1980.
- WALL85 Wallich, P. "Toward Simpler, Faster Computers." IEEE Spectrum, August 1985.
- WALL91 Wall, D. "Limits of Instruction-Level Parallelism." Proceedings, Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, April 1991.
- WANG99 Wang, G., and Tafti, D. "Performance Enhancement on Microprocessors with Hierarchical Memory Systems for Solving Large Sparse Linear Systems." International Journal of Supercomputing Applications, vol. 13, 1999.
- WARD90 Ward, S., and Halstead, R. Computation Structures. Cambridge, MA: MIT Press, 1990.
- WEIN75 Weinberg, G. An Introduction to General Systems Thinking. New York: Wiley, 1975.
- WEIS84 Weiss, S., and Smith, J. "Instruction Issue Logic in Pipelined Supercomputers." *IEEE Transactions on Computers*, November 1984.
- WEIS94 Weiss, S., and Smith, J. POWER and PowerPC. San Francisco: Morgan Kaufmann, 1994.
- WEYG01 Weygant, P. Clusters for High Availability. Upper Saddle River, NJ: Prentice Hall, 2001.
- WHIT97 Whitney, S., et al. "The SGI Origin Software Environment and Application Performance." *Proceedings, COMPCON Spring '97*, February 1997.
- WICK97 Wickelgren, I. "The Facts About FireWire." IEEE Spectrum, April 1997.

WILK51 Wilkes, M. "The Best Way to Design an Automatic Calculating Machine." Proceedings, Manchester University Computer Inaugural Conference, July 1951.

- WILK53 Wilkes, M., and Stringer, J. "Microprogramming and the Design of the Control Circuits in an Electronic Digital Computer." Proceedings of the Cambridge Philosophical Society, April 1953. Reprinted in [SIEW82].
- WILL90 Williams, F., and Steven, G. "Address and Data Register Separation on the M68000 Family." Computer Architecture News, June 1990.
- YEH91 Yeh, T., and Patt, Y. "Two-Level Adaptive Training Branch Prediction." Proceedings, 24th Annual International Symposium on Microarchitecture, 1991.
- ZHAN01 Zhang, Z.; Zhu, Z.; and Zhang, X. "Cached DRAM for ILP Processor Memory Access Latency Reduction." *IEEE Micro*, July–August 2001.