COOLCHIPS XIII Advanced Program

Advanced Program

PDF version is here (As of 9 April 2010).

Keynote Presentations

POWER7: IBM's Next Generation Server Processor

Abstract: Jim Kahle will present a technical overview of the new Power7 system that IBM launched early this year. Jim will start with a brief recap of IBM innovation and Power history. Jim will then cover technical aspects of the chip, from a new core to an enhanced SMP interconnect that includes eDRAM. As chief engineer of the Power7 chip, Jim has unique insight into and experience of how to create a world-class design. It will become evident how Jim's experience in creating chips for laptops, workstations, game consoles and servers can create a breakthrough design. Power7's flexibility, from world-class single-thread execution to 4-way SMT, shows how Power7 will power the workloads of today and tomorrow. Jim will also describe the first systems that have been announced around this exciting chip. Power7 shows how IBM is a technical leader in server-class designs and how it exploits leading-edge technology to be a leader in the industry.

Jim Kahle is a graduate of Rice University, and for more than 25 years at IBM he has held numerous managerial and technical positions. He is a renowned expert in the microprocessor industry, achieving the distinction of IBM Fellow, and is currently the Chief Architect for Power Hybrid systems. Previously he was Chief Technical Lead for Power7. Before that, Jim led the collaborative design for Cell, which was a partnership among IBM, Sony and Toshiba. He was also Chief Architect of the Power4 core used in IBM servers and Apple's G5. He was project manager for the PowerPC 603 series used in Apple laptops and Nintendo GameCubes. Jim has been involved in designs using the Power architecture since its conception. He combines broad processor knowledge with an ability to lead high-performance teams and to drive deep client relationships in order to understand future system requirements, achieving breakthrough innovations in chip design.

High Performance and Low Power Processor for PETA Scale Computing

Abstract: SPARC64VIIIfx is a new processor developed for use in peta-scale computing systems. The chip is fabricated in Fujitsu's 45nm CMOS technology, has 8 cores that run at a speed of 2GHz, and achieves a peak performance of 128GFLOPS. The entire chip consumes only 58W of power when executing a maximum power program.

The design goals for this processor were high performance, low power consumption, and high reliability. High performance is achieved via architectural enhancements. Low power consumption is realized through the selection of a moderate chip frequency, which allows dynamic current to be significantly reduced via extensive clock gating and data bus gating. Power consumption is also reduced by the use of a water cooling system; water cooling enables a lower CPU operating temperature, which reduces leakage current. The lower operating temperature also enables high reliability by minimizing the risk of chip failures. RAS features adapted from mission-critical servers further increase reliability by protecting against both chip failures and soft errors.

This presentation introduces the high-performance architecture of SPARC64VIIIfx and low-power techniques, with a focus on techniques for reducing the power consumption of the L1 and L2 caches.
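As a rough cross-check of the headline figures quoted in the abstract (8 cores, 2GHz, 128GFLOPS peak, 58W), the short Python sketch below derives the floating-point operations per cycle per core that the peak figure implies, and a peak efficiency in GFLOPS per watt. The function names are illustrative only, and the efficiency value assumes that the peak rate and the 58W maximum-power figure can be combined, which the abstract does not state.

```python
# Figures quoted in the abstract above.
CORES = 8
CLOCK_GHZ = 2.0
PEAK_GFLOPS = 128.0
MAX_POWER_W = 58.0


def flops_per_cycle_per_core(peak_gflops, cores, clock_ghz):
    """Floating-point operations each core must complete per cycle to reach the peak."""
    return peak_gflops / (cores * clock_ghz)


def gflops_per_watt(peak_gflops, power_w):
    """Peak throughput divided by power (assumes both figures apply together)."""
    return peak_gflops / power_w


implied_flops = flops_per_cycle_per_core(PEAK_GFLOPS, CORES, CLOCK_GHZ)
efficiency = gflops_per_watt(PEAK_GFLOPS, MAX_POWER_W)
print(f"{implied_flops:.0f} flops/cycle/core, {efficiency:.2f} GFLOPS/W")
```

The quoted numbers imply 8 floating-point operations per cycle per core and roughly 2.2 GFLOPS per watt at peak.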

Iwao Yamazaki is an engineer in the Next Generation Technical Computing unit at Fujitsu. His technical interests include cache designs for server processors. Yamazaki has a BS in physics from Kyoto University.

Super Camera Technology at NHK

Abstract: Television has fulfilled the human desire to see distant scenes in real time. Progress in camera technology is now overcoming the limitations of human vision, such as in spatial resolution, temporal resolution, and sensitivity. In this talk, we will introduce three types of NHK's cutting-edge camera technology. For high spatial resolution, we have developed a Super Hi-vision (SHV) camera. SHV is our future TV broadcasting system and will realize a greater sensation of reality. The camera system has 8k by 4k pixels at 60 frames-per-sec progressive scanning. For high temporal resolution, we have developed an ultrahigh-speed camera that provides one million frames-per-sec. For high sensitivity, we have developed a Super HARP camera whose sensitivity is fifty times higher than that of a conventional CCD camera.

Hiroshi Shimamoto received M.S. and Ph.D. degrees in electrical engineering from Tokyo Institute of Technology in 1991 and 2008, respectively. In 1991, he joined NHK (Japan Broadcasting Corporation). Since 1993, he has been working on high-dynamic-range cameras, HDTV progressive cameras, ultra-high definition cameras (Super Hi-vision cameras), and single-chip color HDTV cameras for broadcasting at NHK Science & Technical Research Laboratories and NHK Engineering Services Inc. In 2005-2006, he was a visiting scholar at Stanford University.

Device Cloud Computing

Abstract: Targeting the audience at this chips conference, we will present a hardware-centric, end-to-end view of cloud computing, from the client side to the data-center side. On the client side, we will present size optimizations for ROM/RAM and performance optimizations. The server side includes both the data center front-end and back-end. According to [Dean & Ghemawat] in CACM 2008, Google processed over 400 PB of data on datacenters composed of thousands of machines in September 2007 alone. What challenges emerge when computing on such a scale? We will describe some general, important advances at the hardware level. Finally, we will discuss how cloud computing and client platforms amplify each other.

Shih-Wei Liao (Google Inc., USA): Dr. Shih-Wei Liao's recent work includes optimization for data centers and Android. Publications on the former include his Supercomputing'09 paper, "Machine Learning-Based Prefetch Optimization for Data Center Applications" and CGO'10 paper "Taming Hardware Event Samples for Feedback-Directed Optimizing Compilation." The latter is an open-source project. On the industry side, Dr. Liao has more than ten years of product experience (at Intel and Google) and more than twenty patent applications. His career has centered around large-scale services or volume products that his Mom uses too. On the academic side, he publishes extensively on XML processing, multicores, compilers and programming systems. Dr. Liao received his bachelor's degree from National Taiwan University in 1991 and his MS and PhD degrees from Stanford University in 1995 and 2000, respectively.

NVIDIA Tegra, Architecture of Low Power

Abstract: Personal computers are on the brink of a technology change as devices move from stationary to all-day mobile platforms, where data is stored in the cloud. The consumer demand for mobile computing drives challenging requirements in terms of power and performance. Learn how holistic design created NVIDIA's Tegra to meet consumer needs.

Gordon Grigor, of NVIDIA's Tegra business, led the software development of Tegra for media player, mobile phone, tablet and netbook devices. NVIDIA Tegra delivers unmatched visual computing, with immersive, highly responsive user interfaces, HD digital media, console-quality gaming, and desktop-experience internet browsing on our most personal computers.

Electrodes on Chips for Life Science Applications, Solutions for Fully-integrated Systems

Abstract: The interface between electronic circuits and life sciences will be one of the focal points of future integrated system design. Several solutions for interactions between electronic devices and biological matter are already available, and they have proved their potential as highly portable systems, high-throughput systems, or both. In this speech, we will address the paradigm of electronic sensors, circuits and systems as privileged means to interact with biological matter at a higher level of detail while bringing the advantage of an almost unlimited choice of signal processing, storage and communication solutions. Sensing principles will be presented from a physics and biophysics perspective. High throughput and integration will be addressed with respect to tradeoffs between high density and signal measurability. A set of biomolecule sensing techniques and nanotechnological amplification means will be presented in their application to silicon-chip measurement systems. The seminar will also tackle the compatibility issues of biochemical processes and solid-state technologies and will describe the different possibilities for developing and scaling molecular sensing sites on a chip.

Carlotta Guiducci holds a PhD in Electrical Engineering from the University of Bologna (I). She was a postdoc at the Nanobiophysics Lab at Ecole Supérieure de Physique et Chimie Industrielles Paris (F) between 2005 and 2007. Later she went back to Bologna, where she coordinated a joint research group of electrical engineers, physicists and biologists funded by an Integrated Project of the EU (DiNamICS) and by national projects. She recently joined the Institute of Bioengineering at the Swiss Federal Institute of Technology in Lausanne (CH), where she holds a position as Tenure-Track Assistant Professor. Her research activity has spanned from the characterization of MOS in the quantum regime to the development of novel techniques for sensing biological affinity reactions on surfaces by means of semiconductor sensors and electronic transducers. In collaboration with Infineon Technologies, she developed two test chips for DNA detection by capacitance measurements, which successfully demonstrated the feasibility of the technique. She has been working on electrical, electrochemical and optical techniques. She demonstrated and patented the measurement of DNA by UV absorption on non-volatile memory cells. Her laboratory team focuses on the design and application of electronic biosensors and is at the forefront of electronic engineering and bioengineering. The sensors address a wide range of applications, from nucleic acid, protein and drug detection to the measurement of bacterial metabolism, and they are based on detection principles supporting electronic transduction, in order to couple and integrate the sensors directly with electronic circuitry for data acquisition. Miniaturization of the sensing site and the development of parallel systems are the main aims pursued.
She has been an invited speaker on several occasions at Stanford University (USA), Ecole Normale Supérieure de Paris (France), Research Center Julich (Germany) and Infineon Technologies (Germany), and she is a reviewer for several international conferences and journals.

Invited Presentations

Multi-Voltage Based Low Power Design Trends and Verification Techniques

Abstract: Today's deep submicron semiconductor process technologies offer designers the ability to implement remarkably rich functionality in a very small die area. However, this results in increased high-frequency transistor switching, and the resultant dynamic power dissipation directly impacts design reliability, battery life, packaging and cooling costs. In addition, with the migration to 65 nm process technologies and below, leakage power becomes just as problematic as dynamic power. In order to address dynamic and leakage power dissipation, designers must incorporate various low power design techniques. Semiconductor physics shows that voltage control is the key to reducing both dynamic and leakage power. Hence voltage control design techniques are becoming mainstream, which in turn imposes varying degrees of new verification challenges, such as state-space complexity explosion, the necessity of voltage-aware simulation and power-aware assertions, as well as checks for functional, structural and architectural changes in the design due to protection cell insertion. Synopsys provides a distinct and unique approach to address low power verification challenges by combining tools and methodology. The Synopsys solution consists of voltage-aware simulation, a static checker that validates power intent throughout the design flow, and an industry-first verification methodology based on best practices from low power experts.
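To illustrate why supply voltage is such an effective lever on dynamic power, the Python sketch below uses the standard first-order CMOS switching-power model, P = alpha * C * Vdd^2 * f. This is a textbook approximation and not the Synopsys methodology described in the abstract; the function name and all parameter values (activity factor, capacitance, voltages, frequency) are hypothetical and chosen only for illustration. Leakage power also depends on voltage but follows a more complex model that is not covered here.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """First-order CMOS switching power in watts: P = alpha * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical block: every value below is an illustrative assumption.
nominal = dynamic_power(activity=0.2, capacitance_f=1e-9, vdd_v=1.2, freq_hz=500e6)
scaled = dynamic_power(activity=0.2, capacitance_f=1e-9, vdd_v=0.9, freq_hz=500e6)

# Only Vdd changed, so the ratio equals (0.9 / 1.2) squared.
ratio = scaled / nominal
print(f"nominal {nominal * 1e3:.1f} mW, scaled {scaled * 1e3:.1f} mW, ratio {ratio:.4f}")
```

Because the voltage term is squared, lowering the supply from 1.2 V to 0.9 V in this example cuts switching power to about 56% of its nominal value at the same frequency.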

Progyna Khondkar is a senior application consultant at Synopsys Japan, in the product specialist group of the engineering division. He received his master's degree from the department of electronics and information science at Hirosaki University, Japan, in March 2001. He received his doctorate from the computer architecture laboratory of the graduate school of information science at Tohoku University, Japan, in March 2004, majoring in multithreaded and parallel processing architecture for embedded processors.

Special Sessions (invited lectures)

Resolving the Grand Paradox: Low Energy and Full Programmability in 4G Mobile Baseband SOCs

Abstract: Continued improvement in silicon density, combined with accelerating growth in mobile baseband terminal bandwidth and volume, is driving basic change for highly integrated systems. But mobile baseband SOCs face an essential paradox - on one hand, increased mobility dictates smaller batteries, longer battery life and improved energy efficiency. On the other hand, the complexity of new baseband standards like LTE - plus the multimedia, network protocols and application services enabled by fast baseband - dictates increased programmability, ubiquitous multi-core and more software layers. How could this possibly work? This talk describes practical successes for ultra-low energy processors used for LTE PHY subsystem designs achieving 150Mbps data rates in less than 250mW. And resolving this paradox has a domino effect on wireless infrastructure, DTV and wired communications.
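The two figures quoted in the abstract (150Mbps in less than 250mW) can be turned into an energy-per-bit budget. The Python sketch below does that unit conversion; the function name is illustrative, and because the abstract gives the power as an upper limit, the result is an upper bound rather than a measured value.

```python
def energy_per_bit_nj(power_mw, rate_mbps):
    """Energy per bit in nanojoules.

    mW / Mbps = (1e-3 J/s) / (1e6 bit/s) = 1e-9 J/bit, i.e. nJ/bit.
    """
    return power_mw / rate_mbps


# Figures quoted in the abstract: under 250 mW at 150 Mbps.
bound_nj = energy_per_bit_nj(power_mw=250.0, rate_mbps=150.0)
print(f"upper bound: {bound_nj:.2f} nJ per bit")
```

On these figures the LTE PHY subsystem spends no more than about 1.67 nJ per delivered bit.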

Dr. Chris Rowen is the founder, chief technical officer, and a member of the board of directors of Tensilica, Inc. He founded Tensilica in July 1997 to develop automatic generation of application-specific microprocessor cores for high-volume communication and consumer systems. Using the approach Chris pioneered, customers today are achieving benchmark-breaking performance while significantly lowering power requirements - results that are not possible using traditional semiconductor design approaches. From his early days as a physics major at Harvard University, Chris has always had a passion for innovation. Right out of college he plunged into leading-edge technology development at Intel in the late 1970s. There, he learned how semiconductor scaling was starting to drive the electronics universe. His passion drove him to pursue a doctorate degree in electrical engineering from Stanford University in the early 1980s. At Stanford, Chris met a young assistant professor, John Hennessy, who was forming a group to study microprocessor architectures. John went on to become one of the most prominent computer architects of the last three decades and president of Stanford University. As part of his research team, Chris helped co-develop a concept commonly called Reduced Instruction Set Computing (RISC). The project formed the basis for a new company, called MIPS, which Chris helped to found in 1984, and for Chris' early work in the study of automated logic synthesis. Filling a variety of roles, he soon became vice president of microprocessor development, until MIPS was acquired by Silicon Graphics in 1992. Chris was presented with the opportunity to run MIPS in Europe, becoming a kind of Silicon Graphics European CTO and market development leader for graphics, supercomputing and the Internet in the mid 1990s. In 1996, Chris moved back to California to become general manager of the Design Reuse Group at Synopsys in the early days of system-on-chip (SOC). 
He led Synopsys' definition of products and strategies for large-scale intellectual property blocks and design reuse tools. This experience helped him realize the limitations of the current hardware-only oriented EDA (electronic design automation) mindset and the shortcomings of existing embedded processor cores for SOC design. Deciding to explore the potential for a new type of processor on his own, he left Synopsys and set up shop in his library at home. Out of this came the realization that a new form of processor - the configurable processor - held the potential to fundamentally change the way complex SOCs are designed. By providing tools that automate and speed the design of configurable processors, he believed he could fundamentally change the way SOCs are designed. This was the genesis for Tensilica. Chris is well known as a speaker on complex technology and business issues, has authored the book, "Engineering the Complex SOC" (published by Prentice Hall in 2004) and numerous technical articles and conference papers, and he holds over two dozen US and international patents.

Convergence of Design and Fabrication Technologies, a Key Enabler for Multi-layer HW-SW Integration

Abstract: The last decade was dominated by HW-SW convergence, where designers learned to combine hardware and software design to cope with the increased demand for lower cost and higher performance. The decade now starting will be dominated by the convergence between design technology (HW and SW) and fabrication technologies. In fact, more and more designs require deep knowledge of technology characteristics to reach the required performance. Conversely, design technologies are increasingly used to overcome fabrication process imperfections and to improve yield.
This talk will first explain the achievements in HW-SW convergence and SoC design. Then, it will address the fabrication technology trends and challenges to deal with this convergence.

Dr. Ahmed Jerraya is Director of Strategic Design Programs at CEA/LETI France. He served as General Chair for the DATE conference in 2001, co-founded the MPSoC Forum (Multiprocessor System on Chip) and served as the organization chair of ESWEEK2009. He has supervised 51 PhDs, co-authored 8 books and published more than 250 papers in international conferences and journals.

Achieving Fast Design Closure Using Networks on Chips

Abstract: The growing complexity of Systems on Chips (SoCs) and Chip Multi-Processors (CMPs) requires communication resources that can only be provided by a highly scalable Network on Chip (NoC) based communication infrastructure. Developing NoC-based systems tailored to a particular application domain is important for achieving high-performance, energy-efficient customized solutions. To achieve early time-to-market, it is important to have a CAD tool flow that automates most of the time-intensive design steps. In this talk, I will first present the basics of NoCs, covering several aspects including topology design, routing and flow control. Then, I will show why a CAD flow is crucial in solving the NoC design problem efficiently and for achieving design closure.

Srinivasan Murali is a co-founder and CTO of iNoCs. He also holds a research scientist position at EPFL. He received the MS and PhD degrees in Electrical Engineering from Stanford University in 2007. His research interests include interconnect design for Systems on Chips, with particular emphasis on developing CAD tools and design methods for Networks on Chips. His interests also include thermal modeling and reliability of multi-core systems. He is a recipient of the EDAA outstanding dissertation award for 2007 for his work on interconnect architecture design. He received a best paper award at the DATE 2005 conference and a best paper nomination at the ICCAD 2006 conference. One of his papers has also been selected as one of "The Most Influential Papers of 10 Years DATE". He has authored a book and has over 40 publications in leading conferences and journals in this field. He has been actively involved in several conferences (such as DATE, CODES-ISSS, NoC symposium, VLSI-SoC) as a program committee member/session chair and is a reviewer for many leading conferences and journals.

Panel Discussion

Topic: "What is the Future Multi-layer Co-design of Computer Systems?"

Organizer and Moderator:

Fumio Arakawa (Renesas Tech., Japan)

Panelists:

Jim Kahle (IBM Corp., USA)
Shih-Wei Liao (Google Inc., USA)
Chris Rowen (Tensilica, USA)
Srinivasan Murali (EPFL/iNoCs, Switzerland)
Ahmed Jerraya (CEA-LETI, France)
Iwao Yamazaki (Fujitsu, Japan)
Keiji Kimura (Waseda Univ., Japan)

Abstract: We could realize extreme computing power if there were no ILP/power/memory walls. However, that is not yet the case. We are now moving to multicore/manycore systems to overcome the power wall. Further, multi-layer co-design is the key to overcoming the ILP and memory walls. Excellent co-design will bring extreme computing power to various future "cool" applications, like cloud computing, digital consumer, mobile, CIS, and so on. A computer system has a multi-layer structure of hardware and software, including drivers, OSes, middleware, and applications. Those layers are defined by APIs, like OpenMP, OpenCL, and OSCAR API, and various tools and platforms are provided for system construction, such as Android, Chrome OS, and so on. However, they are still evolving. So, on the panel, we will discuss the future multi-layer co-design of computer systems.

Fumio Arakawa is a chief researcher of microprocessors in the System Core Development Division of Renesas Electronics Corp. in Tokyo, Japan. He received his BS and MS degrees in applied physics and his Ph.D. degree in electrical engineering from the University of Tokyo in 1984, 1986, and 2007, respectively. He joined the Central Research Laboratory of Hitachi, Ltd. in 1986 and moved in 2009 to Renesas Technology Corp., which is now Renesas Electronics Corp. He is a member of IEICE and IEEE.