Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency / Edition 1

by Kunle Olukotun
ISBN-10: 159829122X
ISBN-13: 9781598291223
Pub. Date: 12/01/2007
Publisher: Morgan and Claypool Publishers

Paperback

Overview

Chip multiprocessors - also called multi-core microprocessors or CMPs for short - are now the only way to build high-performance microprocessors, for a variety of reasons. Large uniprocessors are no longer scaling in performance, because it is only possible to extract a limited amount of parallelism from a typical instruction stream using conventional superscalar instruction issue techniques. In addition, one cannot simply ratchet up the clock speed on today's processors, or the power dissipation will become prohibitive in all but water-cooled systems. Compounding these problems is the simple fact that, with the immense numbers of transistors available on today's microprocessor chips, it is too costly to design and debug ever-larger processors every year or two. CMPs avoid these problems by filling up a processor die with multiple, relatively simple processor cores instead of just one huge core. The exact size of a CMP's cores can vary from very simple pipelines to moderately complex superscalar processors, but once a core has been selected, the CMP's performance can easily scale across silicon process generations simply by stamping down more copies of the hard-to-design, high-speed processor core in each successive chip generation. In addition, parallel code execution, obtained by spreading multiple threads of execution across the various cores, can achieve significantly higher performance than would be possible using only a single core. While parallel threads are already common in many useful workloads, there are still important workloads that are hard to divide into parallel threads. The low inter-processor communication latency between the cores in a CMP helps make a much wider range of applications viable candidates for parallel execution than was possible with conventional, multi-chip multiprocessors; nevertheless, limited parallelism in key applications is the main factor limiting acceptance of CMPs in some types of systems.

Product Details

ISBN-13: 9781598291223
Publisher: Morgan and Claypool Publishers
Publication date: 12/01/2007
Series: Synthesis Lectures on Computer Architecture Series, #3
Pages: 143
Product dimensions: 7.40(w) x 9.10(h) x 0.40(d)

About the Author

Kunle Olukotun is a Professor of Electrical Engineering and Computer Science at Stanford University. Olukotun led the Stanford Hydra project, which developed the first chip multiprocessor (multicore chip) with support for thread-level speculation. Using insights gained from the Hydra project, Olukotun founded Afara Websystems to demonstrate the benefits of chip multiprocessor technology for high-throughput, low-power server systems. Afara's microprocessor technology, called Niagara, was acquired by Sun Microsystems. The Niagara-based Sun Fire CoolThreads servers have become one of Sun's fastest-ramping products ever. Olukotun is actively involved in research in computer architecture, parallel programming environments, and scalable parallel systems. Currently, Olukotun directs the Stanford Pervasive Parallelism Lab (PPL), which seeks to proliferate the use of parallelism in all application areas. Olukotun is a Fellow of the ACM. Olukotun received his Ph.D. in Computer Engineering from The University of Michigan.

James Laudon is a Distinguished Engineer with Sun Microsystems. His areas of expertise include multithreading, multiprocessors, and performance modelling. He is currently focused on the architecture of future generations in the UltraSPARC T1 chip multiprocessor line. James joined Sun in July of 2002 through the acquisition of Afara Websystems.

Table of Contents

The Case for CMPs
A New Approach: The Chip Multiprocessor (CMP)
The Application Parallelism Landscape
Simple Example: Superscalar vs. CMP
Simulation Results
This Book: Beyond Basic CMPs
Improving Throughput
Simple Cores and Server Applications
The Need for Multithreading within Processors
Maximizing the Number of Cores on the Die
Providing Sufficient Cache and Memory Bandwidth
Case Studies of Throughput-oriented CMPs
Example 1: The Piranha Server CMP
Example 2: The Niagara Server CMP
Example 3: The Niagara 2 Server CMP
Simple Core Limitations
General Server CMP Analysis
Simulating a Large Design Space
Choosing Design Datapoints
Results
Discussion
Improving Latency Automatically
Pseudo-parallelization: "Helper" Threads
Automated Parallelization Using Thread-Level Speculation (TLS)
An Example TLS System: Hydra
The Base Hydra Design
Adding TLS to Hydra
Using Feedback from Violation Statistics
Performance Analysis
Completely Automated TLS Support: The JRPM System
Concluding Thoughts on Automated Parallelization
Improving Latency Using Manual Parallel Programming
Using TLS Support as Transactional Memory
An Example: Parallelizing Heapsort Using TLS
Parallelizing SPEC2000 with TLS
Transactional Coherence and Consistency (TCC): More Generalized Transactional Memory
TCC Hardware
TCC Software
TCC Performance
Mixing Transactional Memory and Conventional Shared Memory
A Multicore World: The Future of CMPs
Author Biography
