Overview of Data Centers Energy Efficiency Evolution

Lennart Johnsson

December 2010

Department of Computer Science
University of Houston
Houston, TX

School of Computer Science and Communications
Royal Institute of Technology
Stockholm, Sweden

Abstract

Data centers are becoming one of the major consumers of energy in modern economies, accounting for a measurable fraction of total energy consumption. Furthermore, the energy consumption of data centers is growing rapidly, doubling about every six years. With large data centers consuming several MW of power, or even tens of MW, energy efficiency and energy consumption of data centers have become of great importance economically and environmentally. Up to a decade or so ago it was common that the computing equipment in data centers consumed only a third or less of the total energy. Today, in well-designed data centers the computing equipment consumes at least 60 – 70% of the total power, and in some designs as much as 90% or more. To further enhance energy efficiency, various forms of energy reuse are also pursued, thereby reducing the energy consumption for other purposes and hence also contributing to reduced emissions.

Reducing the energy consumption for computing has for the last decade also been one of the key drivers for CPU and platform vendors. Increasingly, energy efficient computing system and data center design requires an integrated approach to server and system design, cooling and potential energy reuse. In this chapter we will review recent developments as well as emerging technologies for energy efficient server and data center designs, as well as the challenges in using clean energy sources.
1. Introduction

Energy efficiency in computation has become the number one concern for infrastructure providers for environmental and cost reasons. Both concerns have driven and continue to drive energy efficiency in design of data centers and computer systems and component technologies. The concerns also have impacted how data centers and systems are operated with dynamic management of major system components increasingly being introduced, and also impacted the selection of energy sources for major data centers, and their location.

There is a huge difference in greenhouse gas emissions by different energy sources. A study regarding emissions related to electricity generation by the UK Parliamentary Office of Science and Technology carried out in 2006 [1] resulted in the life-cycle assessment of CO₂ and other greenhouse gas emissions shown in Table 1. The emissions are expressed as gram CO₂ equivalents (gCO₂eq) per kWh. This measure accounts for the warming effects of CO₂ and other greenhouse gases. As can be seen from Table 1, the range in gCO₂eq/kWh between the energy sources with the lowest and highest emissions is more than a factor of 200. Life-cycle assessment includes greenhouse gas emissions for all stages related to electricity generation including plant construction, operation, maintenance and decommissioning and fuel extraction, transport and processing.

<table>
<thead>
<tr>
<th>Energy Source</th>
<th>Coal</th>
<th>Oil</th>
<th>Gas</th>
<th>Biomass</th>
<th>Solar PV</th>
<th>Marine</th>
<th>Hydro</th>
<th>Wind</th>
<th>Nuclear</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>&gt;1000</td>
<td>~650</td>
<td>~500</td>
<td>25 - 93</td>
<td>35 - 58</td>
<td>25 - 50</td>
<td>5 - 30</td>
<td>~5</td>
<td>~5</td>
</tr>
</tbody>
</table>

Table 1. Life-cycle assessment of greenhouse gas emissions expressed as grams of CO₂ equivalents per kWh (gCO₂eq/kWh) with 2006 technologies [1].

Unfortunately, much of the world’s electricity generation is based on coal. According to the 2010 Key World Energy Statistics by the International Energy Agency (IEA) [2] and the Pew Center on Global Climate Change Climate TechBook’s Electricity Generation Overview [3], about 41% of electricity is based on coal, about 20% on natural gas and about 6% on oil. Thus, about 67% of all electric energy comes from the three sources that generate the most gCO₂eq/kWh. Hydroelectric energy accounts for 16% of total world electric energy generation and nuclear energy for 15%. The world electricity generation by energy source in 2008 is summarized in Table 2.

<table>
<thead>
<tr>
<th>Energy Source</th>
<th>Coal</th>
<th>Oil</th>
<th>Gas</th>
<th>Hydro</th>
<th>Nuclear</th>
<th>Other</th>
</tr>
</thead>
<tbody>
<tr>
<td></td>
<td>41%</td>
<td>6%</td>
<td>20%</td>
<td>16%</td>
<td>15%</td>
<td>2%</td>
</tr>
</tbody>
</table>

Table 2. World electric energy generation by energy source, 2008 [2].

The result is that electric energy generation from fossil fuel energy sources accounts for almost all greenhouse gas emissions for electric energy generation, with about 73% of CO₂ emissions for electricity generation due to coal as a source of energy, about 19% due to natural gas and 7% due to oil. Table 3 summarizes the greenhouse gas emissions for world electricity generation by energy source from Tables 1 and 2.
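
The shares quoted above can be reproduced approximately by weighting the emission factors in Table 1 by the generation mix in Table 2. The following is a minimal sketch, not the calculation used in the cited studies. It assumes 1000 gCO₂eq/kWh for coal (Table 1 gives only a lower bound), the midpoint of the hydro range, and a placeholder of 40 gCO₂eq/kWh for the "Other" category, which Table 1 does not define.

```python
# Approximate emission factors from Table 1 (gCO2eq/kWh).
# Assumptions: coal uses the lower bound (>1000), hydro uses the range midpoint,
# and "Other" is a placeholder value that is not given in the source.
EMISSION_FACTOR = {
    "Coal": 1000.0,
    "Oil": 650.0,
    "Gas": 500.0,
    "Hydro": (5 + 30) / 2,
    "Nuclear": 5.0,
    "Other": 40.0,
}

# Share of world electricity generation in 2008 from Table 2.
GENERATION_SHARE = {
    "Coal": 0.41, "Oil": 0.06, "Gas": 0.20,
    "Hydro": 0.16, "Nuclear": 0.15, "Other": 0.02,
}


def emission_shares(factors, mix):
    """Return (average gCO2eq/kWh, share of total emissions per source)."""
    weighted = {src: factors[src] * mix[src] for src in mix}
    total = sum(weighted.values())
    return total, {src: w / total for src, w in weighted.items()}


average_intensity, shares = emission_shares(EMISSION_FACTOR, GENERATION_SHARE)
for source, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{source:8s} {share:6.1%}")
print(f"Average intensity: {average_intensity:.0f} gCO2eq/kWh")
```

Under these assumptions coal accounts for roughly three quarters of the emissions and the three fossil sources together for about 99%, in line with the figures quoted in the text.
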
The need for electric energy is expected to more than double by 2050, and dramatic changes in electric energy production are necessary to meet climate change targets. In the IEA Blue Scenario [4] it is estimated that limiting the temperature rise to 2°C by 2050 will require a 50% reduction in greenhouse gas emissions compared to 2005. To achieve this level of reduction in greenhouse gas emissions due to electricity generation, the gCO₂eq/kWh emissions must be reduced by 85% compared to 2008, which requires a drastic change in the energy sources used for electric energy, as shown in Figure 1. All coal-based generation of electric energy will need to use carbon capture and storage (CCS) techniques, and coal must be reduced to account for about 12% of all electric energy generation, or about 5 PWh of an estimated total of 40 PWh. Natural gas is projected to account for about 10% and natural gas with CCS for another 2%. Oil as an energy source for electric energy is projected to be very small, and hence fossil fuels are reduced to account for about 27% of all electric energy generation. Nuclear power is projected to account for about as much electric energy as fossil fuels, or about 25%, with renewable sources accounting for close to half of all electric energy.

Figure 1. Electric energy by source [4]. Decarbonizing the electricity sector to limit temperature rise to about 2°C by 2050.
2. Data centers

2.1 Energy
The contribution of the Information and Communications Technology (ICT) sector to greenhouse gas emissions, though small, is growing faster than overall emissions. The Smart2020 report [5] estimated that total emissions from all sources will increase by about 30% from 2002 to 2020, while the ICT sector emissions (including PCs) will grow by 180% during the same period. However, the report also estimated that by 2020 the ICT sector will enable a reduction in emissions in other sectors of more than five times its own emissions. The effect of the ICT sector on overall energy consumption was also studied by the American Council for an Energy-Efficient Economy. This study found that, for the United States, for every kWh consumed by the IT industry about 10 kWh is saved in other parts of the economy [6]. Though economies differ between countries, in most modern economies IT should be part of the solution for more energy efficient and environmentally friendly economies.

The greenhouse gas emissions of data centers are predicted by the Smart2020 report to grow even faster than the overall emissions of the ICT sector, by 240% from 2002 to 2020. The capital and operating costs of energy for operating and cooling computer systems have increased very rapidly over the last 15 – 20 years. Though there is a wide range in power consumption for servers used for HPC systems, servers based on the x86 architecture have come to dominate the HPC systems market. Therefore, the energy consumption of this type of server can be used for a view of a typical HPC center’s energy efficiency and cost evolution. The power consumption of a typical x86 based server has increased during the last decade from somewhat less than 100W on average to about 250W [7], an almost three-fold increase, while server costs have remained fairly constant or even decreased slightly [7]. In 2007 Belady [8] estimated that in 2008 the energy cost for operating and cooling a standard x86 server would equal the cost of the server, while already in 2004 the capital expense for power and cooling equaled the cost of the server. The combined capital and operating cost for power and cooling equaled the server cost already in 2001 according to Belady, and by now, 2010, lifetime power and cooling costs amount to more than double the server cost, as shown in Figure 2.

![Figure 2. Evolution of US power and cooling costs for a standard x86 server.](source.jpg)
Though it is difficult to find good data on the cost of energy delivered to a data center from different sources, including transport losses, and on the energy required for information transport between a data center and its users, it is a generally held opinion that it is economically advantageous to locate data centers close to an electric power source, preferably an inexpensive, clean and renewable source of energy, or in a location in which cooling can be realized at low cost. Some insight into the cost of data transport to and from a data center can be gained from the studies that have estimated the energy efficiency of the Internet. According to [9], the Internet in 2008 required about 7 kWh/GB of traffic, with an efficiency improvement of 30% per year (Figure 3). However, about two thirds of this energy consumption is estimated to be due to servers and storage systems [10], and only about one third due to data transport, as seen from Table 4.

![Figure 3. Electricity intensity of the Internet [9]](image)

Internet and Phone System Direct Energy Use

<table>
<thead>
<tr>
<th>Equipment Type</th>
<th>2000 electricity use (TWh/year)</th>
<th>2006 electricity use (TWh/year)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Internet (a)</td>
<td>19.3</td>
<td>42.3</td>
</tr>
<tr>
<td>Servers (b)</td>
<td>11.6</td>
<td>24.5</td>
</tr>
<tr>
<td>Data Storage (c)</td>
<td>1.5</td>
<td>4.4</td>
</tr>
<tr>
<td>Hubs (d)</td>
<td>1.6</td>
<td>3.5</td>
</tr>
<tr>
<td>Routers (e)</td>
<td>1.1</td>
<td>2.4</td>
</tr>
<tr>
<td>LAN Switches (d)</td>
<td>3.3</td>
<td>7.2</td>
</tr>
<tr>
<td>WAN Switches (d)</td>
<td>0.2</td>
<td>0.3</td>
</tr>
<tr>
<td>Telephone Systems (a)</td>
<td>3.8</td>
<td>2.5</td>
</tr>
<tr>
<td>Transmission (e)</td>
<td>1.8</td>
<td>1.2</td>
</tr>
<tr>
<td>Public Phone Network (e)</td>
<td>1.0</td>
<td>0.7</td>
</tr>
<tr>
<td>Private Branch Exchanges (PBX) (e)</td>
<td>1.0</td>
<td>0.7</td>
</tr>
<tr>
<td><strong>Total</strong></td>
<td><strong>23.1</strong></td>
<td><strong>44.9</strong></td>
</tr>
</tbody>
</table>

a. These estimates do not include energy use for ventilation, cooling, and auxiliary equipment.
b. From EPA (2007). Includes energy use from all types of servers.
d. Year 2000 data from Roth (2002), year 2006 value scaled by growth in total phone system data traffic.
e. The estimated decline of energy use in transmission equipment for voice traffic may be offset somewhat by increasing energy use of co-located transmission equipment carrying data traffic.

Table 4. Energy consumption in the Internet [10]
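
The two-thirds figure can be checked directly against the Internet rows of Table 4. The sketch below is illustrative only; the grouping of "Servers" and "Data Storage" as the compute and storage share follows the text, while the function names are our own.

```python
# Internet direct electricity use (TWh/year) from Table 4.
INTERNET_TWH = {
    2000: {"Servers": 11.6, "Data Storage": 1.5, "Hubs": 1.6,
           "Routers": 1.1, "LAN Switches": 3.3, "WAN Switches": 0.2},
    2006: {"Servers": 24.5, "Data Storage": 4.4, "Hubs": 3.5,
           "Routers": 2.4, "LAN Switches": 7.2, "WAN Switches": 0.3},
}
COMPUTE_AND_STORAGE = ("Servers", "Data Storage")


def compute_share(year):
    """Fraction of Internet electricity use due to servers and storage."""
    rows = INTERNET_TWH[year]
    return sum(rows[k] for k in COMPUTE_AND_STORAGE) / sum(rows.values())


def annual_growth(start_year, end_year):
    """Compound annual growth rate of total Internet electricity use."""
    start = sum(INTERNET_TWH[start_year].values())
    end = sum(INTERNET_TWH[end_year].values())
    return (end / start) ** (1 / (end_year - start_year)) - 1


for year in INTERNET_TWH:
    print(year, f"servers + storage: {compute_share(year):.0%}")
print(f"growth 2000-2006: {annual_growth(2000, 2006):.1%} per year")
```

For both years servers and storage come out at about 68% of the Internet total, and the total grows by roughly 14% per year between 2000 and 2006.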

The cost of electricity from different sources may vary significantly by location, but for the US, the predictions made by the US Department of Energy’s Energy Information Administration (EIA) give an indication of expected energy costs by source. Table 5 shows predictions for 2016. Of energy sources with low environmental impact, hydro, biomass, geothermal and nuclear are estimated to be very cost competitive.

Estimated Levelized Cost of New Generation Resources, 2016

<table>
<thead>
<tr>
<th rowspan="2">Plant Type</th>
<th rowspan="2">Capacity Factor (%)</th>
<th colspan="6">U.S. Average Levelized Costs (2008 $/megawatthour) for Plants Entering Service in 2016</th>
</tr>
<tr>
<th></th>
<th>Levelized Capital Cost</th>
<th>Fixed O&amp;M</th>
<th>Variable O&amp;M (including fuel)</th>
<th>Transmission Investment</th>
<th>Total System Levelized Cost</th>
</tr>
</thead>
<tbody>
<tr>
<td>Conventional Coal</td>
<td>85</td>
<td></td>
<td>69.2</td>
<td>3.8</td>
<td>23.9</td>
<td>3.6</td>
<td>100.4</td>
</tr>
<tr>
<td>Advanced Coal</td>
<td>85</td>
<td></td>
<td>81.2</td>
<td>5.3</td>
<td>20.4</td>
<td>3.6</td>
<td>110.5</td>
</tr>
<tr>
<td>Advanced Coal with CCS</td>
<td>85</td>
<td></td>
<td>92.6</td>
<td>6.3</td>
<td>26.4</td>
<td>3.9</td>
<td>129.3</td>
</tr>
<tr>
<td>Natural Gas-fired</td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Conventional Combined Cycle</td>
<td>87</td>
<td></td>
<td>22.9</td>
<td>1.7</td>
<td>54.9</td>
<td>3.6</td>
<td>83.1</td>
</tr>
<tr>
<td>Advanced Combined Cycle</td>
<td>87</td>
<td></td>
<td>22.4</td>
<td>1.6</td>
<td>51.7</td>
<td>3.6</td>
<td>79.3</td>
</tr>
<tr>
<td>Advanced CC with CCS</td>
<td>87</td>
<td></td>
<td>43.8</td>
<td>2.7</td>
<td>63.0</td>
<td>3.8</td>
<td>113.3</td>
</tr>
<tr>
<td>Conventional Combustion Turbine</td>
<td>30</td>
<td></td>
<td>41.1</td>
<td>4.7</td>
<td>82.9</td>
<td>10.8</td>
<td>139.5</td>
</tr>
<tr>
<td>Advanced Combustion Turbine</td>
<td>30</td>
<td></td>
<td>38.5</td>
<td>4.1</td>
<td>70.0</td>
<td>10.8</td>
<td>123.5</td>
</tr>
<tr>
<td>Advanced Nuclear</td>
<td>90</td>
<td></td>
<td>94.9</td>
<td>11.7</td>
<td>9.4</td>
<td>3.0</td>
<td>119.0</td>
</tr>
<tr>
<td>Wind</td>
<td>34.4</td>
<td></td>
<td>130.5</td>
<td>10.4</td>
<td>0.0</td>
<td>8.4</td>
<td>149.3</td>
</tr>
<tr>
<td>Wind - Offshore</td>
<td>39.3</td>
<td></td>
<td>159.9</td>
<td>23.8</td>
<td>0.0</td>
<td>7.4</td>
<td>191.1</td>
</tr>
<tr>
<td>Solar PV</td>
<td>21.7</td>
<td></td>
<td>376.8</td>
<td>6.4</td>
<td>0.0</td>
<td>13.0</td>
<td>396.1</td>
</tr>
<tr>
<td>Solar Thermal</td>
<td>31.2</td>
<td></td>
<td>224.4</td>
<td>21.8</td>
<td>0.0</td>
<td>10.4</td>
<td>256.6</td>
</tr>
<tr>
<td>Geothermal</td>
<td>90</td>
<td></td>
<td>88.0</td>
<td>22.9</td>
<td>0.0</td>
<td>4.8</td>
<td>115.7</td>
</tr>
<tr>
<td>Biomass</td>
<td>83</td>
<td></td>
<td>73.3</td>
<td>9.1</td>
<td>24.9</td>
<td>3.8</td>
<td>111.0</td>
</tr>
<tr>
<td>Hydro</td>
<td>51.4</td>
<td></td>
<td>103.7</td>
<td>3.5</td>
<td>7.1</td>
<td>5.7</td>
<td>119.9</td>
</tr>
</tbody>
</table>
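
To make the cost comparison concrete, the sketch below ranks the low-emission plant types by total system levelized cost and expresses each as a premium over conventional coal. The values are copied from the table above; which plant types count as "low impact" is an assumption made here for illustration, as are the function and variable names.

```python
# Total system levelized cost (2008 $/MWh) for selected plant types.
LEVELIZED_COST = {
    "Conventional Coal": 100.4,
    "Advanced Combined Cycle": 79.3,
    "Advanced Nuclear": 119.0,
    "Wind": 149.3,
    "Wind - Offshore": 191.1,
    "Solar PV": 396.1,
    "Solar Thermal": 256.6,
    "Geothermal": 115.7,
    "Biomass": 111.0,
    "Hydro": 119.9,
}
# Assumption for illustration: which plant types count as "low impact".
LOW_IMPACT = ["Advanced Nuclear", "Wind", "Wind - Offshore", "Solar PV",
              "Solar Thermal", "Geothermal", "Biomass", "Hydro"]


def premium_over(baseline, plant):
    """Relative cost premium of `plant` over `baseline` (0.10 means +10%)."""
    return LEVELIZED_COST[plant] / LEVELIZED_COST[baseline] - 1


ranked = sorted(LOW_IMPACT, key=LEVELIZED_COST.get)
for plant in ranked:
    print(f"{plant:18s} {LEVELIZED_COST[plant]:6.1f} $/MWh "
          f"({premium_over('Conventional Coal', plant):+.0%} vs. conventional coal)")
```

With these numbers biomass, geothermal, nuclear and hydro all fall within about 20% of conventional coal, whereas wind and solar carry a considerably larger premium.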


An example of industry locating a major data center close to a clean, renewable electric energy source is Google’s data center at The Dalles, Oregon. Google, which publicly stresses both the energy efficiency of its infrastructure and high environmental standards, located one of its major US data centers in The Dalles, Oregon [12, 13], next to the Columbia River and close to a 2 GW [14] hydroelectric power plant. In Europe, Google is currently building a data center in Hamina, Finland [15, 16], that will use Baltic Sea water for cooling, enabling “free” cooling, i.e. cooling without chillers, and some wind power from a wind farm being built next to the data center. Furthermore, in Finland nuclear, hydro, wind, biomass and peat account for over 60% of electric energy generation whereas coal, oil and natural gas only account for less than 25% [17]. Another example of the use of cool “natural” water for “free” cooling is the Swiss National Supercomputing Centre’s (CSCS) new data center [18] under construction in Lugano, Switzerland, that will use water from Lake Lugano through 2.8 km long pipes for a 16 MW data center design. Another example of the use of free cooling is Stanford University’s planned new data center, which is estimated to save $3M/yr compared to their current data center that uses chillers [19]. Other examples of data center locations enabling low cost cooling through “free” cooling by using outside air and eliminating chillers are Microsoft’s data center in Dublin, Ireland [20] and Google’s data center in Belgium [21,22]. To reduce its environmental impact Google has also made major investments in renewable energy, such as wind, investing in two wind farms in North Dakota with a total of 170 MW capacity [23], purchasing 114 MW over a 20-year period from Iowa wind farms [24,25], and investing in wind power transmission infrastructure [26]. Other companies operating large data centers also consider both cost and environmental impact in locating their data centers.

2.2 Energy efficiency measures
The energy efficiency of data centers has been of great concern for major centers for close to a decade and significant improvements have been made in their design and operation. Low cost of the infrastructure and its operation is a major competitive advantage for Web and Internet companies such as Amazon, Google, Microsoft and Yahoo resulting in limited openness about center efficiencies and, in particular, how this efficiency is achieved. However, in recent years the secrecy has decreased.

A typical distribution of energy consumption in a traditional data center is illustrated in Figure 4 [27]. Some specific results are reported in the 2007 EPA report to the US Congress [28]. From Figure 4 it is apparent that high energy efficiency requires elimination of chillers and reduced losses in power conversion. To measure the effectiveness of energy use, the Green Grid introduced two measures for data centers: Power Usage Effectiveness (PUE), the ratio of total facility power to IT equipment power, and its inverse, Data Center infrastructure Efficiency (DCiE) [29], both illustrated in Figure 5. State-of-the-art data centers today claim a PUE of about 1.2 [30, 31, 32], which in the case of [31] and [32] refers to chillerless data centers, and most likely also in the case of [30]. It is clear that reaching efficiencies at the reported level also required significant other improvements.

![Figure 4. Electricity use in a traditional data center](image)

![Figure 5. PUE and DCiE](image)
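
As a small worked example of the two metrics, the sketch below computes PUE, DCiE and the resulting infrastructure overhead for three hypothetical facilities with the same IT load. The facility power figures are illustrative values chosen to match the ranges mentioned in the text, not measurements.

```python
def pue(total_facility_kw, it_equipment_kw):
    """Power Usage Effectiveness: total facility power over IT equipment power."""
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw


def dcie(total_facility_kw, it_equipment_kw):
    """Data Center infrastructure Efficiency: the inverse of PUE."""
    return it_equipment_kw / total_facility_kw


def overhead_fraction(pue_value):
    """Fraction of total facility power not consumed by the IT equipment."""
    return 1 - 1 / pue_value


# Hypothetical facilities, each with a 1000 kW IT load.
for label, total_kw in [("traditional", 3000),
                        ("well-designed", 1500),
                        ("state of the art", 1200)]:
    value = pue(total_kw, 1000)
    print(f"{label:17s} PUE={value:.2f} "
          f"DCiE={dcie(total_kw, 1000):.0%} "
          f"overhead={overhead_fraction(value):.0%}")
```

A PUE of 1.2 corresponds to a DCiE of about 83%, so at most about 17% of the facility power remains to be saved through infrastructure improvements alone.
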
2.3 Cooling

In regard to cooling, a key aspect is controlled airflow to minimize or completely prevent mixing of cold air entering servers and hot air leaving servers. Air cooling dominates today’s data centers and is an implicit assumption in Figures 4 and 5. A typical approach is to arrange computer racks such that the airflow forms alternating hot and cold aisles as illustrated in Figure 6 [33].

![Figure 6. Hot-aisle/cold-aisle arrangement of racks.](image)

The hot/cold aisle arrangement can be combined with enclosures to further assure separation of hot and cold air as shown in Figure 7 [34].

![Figure 7. Hot-aisle enclosure.](image)
This form of arrangement is often combined with in-row cooling for high heat densities as illustrated in Figure 8 [35].

The management of airflow for effective cooling is a strong contributor to the energy effectiveness of containerized data centers as illustrated in Figure 9 [36]. Practically all major vendors now have some form of containerized data center [37]. The idea of modularized (large) data centers originates from a need for rapid and cost effective deployment of large data centers with Google filing for a patent on containerized data centers in 2003 [38], Figure 10, and Microsoft discussing containerized centers in 2007 and 2008 [39,40] and showing one of their containerized centers in 2009 [41]. For an overview of data center trends see [42].

Figure 8. Hot-aisle enclosure with in-row cooling.

Figure 9. Data center in a container.

Figure 10. Google’s container data center concept [38].
Not all servers have a front-to-back airflow as assumed above. For server designs that have a sideways flow, such as the IBM Blue Gene, arrangements as in Figure 11 can be made to prevent mixing of cold and hot air [43].

Transverse airflow is also planned for the next generation of Cray systems, in which air flows across an entire rack row, with temperature-restoring water coils for each rack and blowers for each pair of racks to maintain the temperature and speed of the airflow, as shown in Figure 12 [44]. The claim is that, despite being open, this design will bring the PUE down to less than 1.05. A 6 kW blower cabinet serves two 100+ kW cabinets.

Currently Cray systems are designed for a vertical airflow [45], as shown in Figure 13. Vertical airflow is also used by some other vendors having high-density solutions.
As evident from Figures 12 and 13 and the in-row cooling units in Figure 8, liquid cooling has moved closer to the racks, or even into the racks. An alternative to in-row cooling or top-mounted cooling is cooled rack doors that largely restore the air temperature to close to the inlet temperature. The idea is illustrated in Figure 14 [46]. Several vendors have doors of this type with a cooling capacity of up to 40 kW or more, depending on water temperatures and water flow. With water cooling at the rack level, claims are made that less than 2% of the energy is required for cooling of servers.

The evolution of cooling solutions is not only a reflection of a need for more energy efficient and environmentally friendly solutions, but also a consequence of increased heat densities.

Traditional data center designs had raised floors with open air flow and Computer Room Air Conditioning (CRAC) units along the walls as illustrated in Figure 15 [47].

This design was appropriate when heat densities were low. According to ASHRAE (American Society of Heating, Refrigerating and Air-Conditioning Engineers) [48], 20 years ago densities of about 3 kW/m² were common for centers dominated by compute servers [49], but recently announced products [50, 51, 52] result in heat densities of about 20 times that, as shown in Figure 16.
The increased heat densities are due in part to increased component heat densities, and in part to improved cooling techniques enabling increased packing densities and consequently increased heat densities. Figure 17 illustrates the CPU heat density trends that were dominant through the early part of the last decade, at which point heat densities resulted in a cap on power dissipation, so that for the last several years CPUs have largely been designed for a fixed maximum power dissipation. In fact, in recent years a range of x86 based CPUs have been introduced with lower power dissipation and clock frequencies.
The high heat densities of some components have led to the introduction of component liquid cooling techniques, such as those used, for instance, in IBM’s Power7 based servers shown in Figure 18 and Figure 19 [51].

Figure 17. CPU heat density evolution. Source Shekhar Borkar, Intel circa 2001.

Figure 18. An IBM Power7 water cooled 8 Tflop/s server unit (2U rack units high) with 8 Multi-Chip-Modules with a total of 256 cores.
Recently, liquid cooling in the form of liquid enclosed blades [53] has also been introduced as shown in Figure 20, or even entirely liquid filled racks, Figure 21 [54].

Operating temperatures have an impact on the energy consumption of data centers, though the relationship is not simple and component temperatures may affect reliability and longevity. ASHRAE has made thorough studies and recommendations for server inlet temperatures. The first recommendation, 25°C, was made in 2004 and was revised to 27°C in 2008 [55]. It has been claimed [56] that for every degree of increase in set-point, an energy savings of 4% can be realized. In [57] it was shown that a reduction in cooling energy requirements of as much as 30% can result from raising the inlet temperature from 18°C to 27°C, but that the total net energy savings may be about 10% due to increased energy consumption for other systems in the data center including the computer system itself, see Figure 22.
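
The two claims can be compared with a small calculation. The sketch below assumes, purely for illustration, that the 4% per degree claimed in [56] refers to degrees Celsius and to cooling energy; the source does not state either, nor whether the saving compounds, so both readings are shown.

```python
def cooling_savings(delta_deg_c, per_degree=0.04, compounding=True):
    """Fractional cooling-energy reduction for a set-point increase.

    per_degree is the claimed saving per degree [56]. Whether it compounds
    is not stated in the source, so both readings are offered.
    """
    if compounding:
        return 1 - (1 - per_degree) ** delta_deg_c
    return min(1.0, per_degree * delta_deg_c)


delta = 27 - 18  # inlet temperature raised from 18 C to 27 C, as in [57]
print(f"compounding: {cooling_savings(delta):.1%}")
print(f"linear:      {cooling_savings(delta, compounding=False):.1%}")
```

Under the compounding reading a 9 degree increase gives a reduction of about 31%, close to the 30% reduction in cooling energy reported in [57]; the linear reading gives 36%.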

As the temperature set-point is raised, the load on the server fans increases to assure that component temperatures stay below the target values. According to ASHRAE, when set-points increase over 25°C the fan energy consumption increases significantly. The study reported in [57] also noticed a significant increase in server energy consumption with increased temperature values even with constant fan speeds under high loads. It is interesting to note according to a recent study [58], that many data centers operate at significantly lower temperatures than ASHRAE recommends, see Figure 23.
Increased temperatures do not only reduce required cooling energy but also increase the potential for energy reuse. Typical return water from CRAC units is not warm enough for a variety of needs and hence may in fact represent more of a problem than an asset. However, direct cooling of components, as explored by IBM in a research project [59], would enable using cooling water with an inlet temperature as high as 60° C. The idea is illustrated in Figure 24.

The Green Grid is defining a new metric to account for energy reuse by introducing an Energy Reuse Factor, ERF, which is the fraction of energy used for the IT equipment that is being reused [60].

2.4 Power supply
The typical data center power supply structure is illustrated in Figure 25 which excludes a typical substation in which power supply voltage is stepped down to 480 or 400V 3-phase from several kV. However, common servers are designed to work with 12V DC internally. Thus, conversion from AC to DC as well as further reduction in voltage must take place.
Distributing DC rather than AC within the data center has been proposed as a way to reduce conversion losses, but there is disagreement about the potential gain [61, 62, 63, 64]. The estimated efficiency of a well-designed DC power distribution system is about 88% (Figure 26), including UPS (Uninterruptible Power Supplies), wiring losses and PSUs (Power Supply Units) [61]. At this level of efficiency DC distribution is estimated to offer an efficiency advantage over a well-designed and operated AC system of 2 – 5%, which is likely to be too small for DC to become the dominant data center power distribution method, since such a conversion would require major investments and adoption of technologies that today do not have a broad market.

Figure 26. Comparison of the energy efficiency of DC and AC power distribution in the data center [61].

<table>
<thead>
<tr>
<th></th>
<th>UPS</th>
<th>Distribution wiring</th>
<th>IT power supply</th>
<th>Overall efficiency</th>
</tr>
</thead>
<tbody>
<tr>
<td>DC</td>
<td>96.0%</td>
<td>99.5%</td>
<td>91.75%</td>
<td>87.64%</td>
</tr>
<tr>
<td>AC</td>
<td>96.2%</td>
<td>99.5%</td>
<td>90.25%</td>
<td>86.39%</td>
</tr>
</tbody>
</table>
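
The overall efficiencies in this comparison are the products of the three stage efficiencies, since the stages are connected in series. A minimal sketch of that arithmetic follows; the dictionary layout and function name are our own.

```python
from math import prod

# Stage efficiencies from the DC/AC comparison in [61].
CHAINS = {
    "DC": {"UPS": 0.960, "Distribution wiring": 0.995, "IT power supply": 0.9175},
    "AC": {"UPS": 0.962, "Distribution wiring": 0.995, "IT power supply": 0.9025},
}


def overall_efficiency(stages):
    """Efficiency of stages connected in series is the product of the stages."""
    return prod(stages.values())


for name, stages in CHAINS.items():
    print(f"{name}: {overall_efficiency(stages):.2%}")

advantage = overall_efficiency(CHAINS["DC"]) - overall_efficiency(CHAINS["AC"])
print(f"DC advantage: {advantage * 100:.2f} percentage points")
```

For these particular stage values the DC chain comes out about 1.25 percentage points ahead of the AC chain.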

UPS is used by many data centers to assure high availability. The most common UPS equipment uses batteries to supply power to the data center in case of a loss of external power. Thus, a conversion from AC to DC is required to keep the batteries charged, and a conversion from DC to AC is required for the distribution of power in the data center when AC is used for this task. The efficiency of this double conversion has improved to up to 98% from a typical value of around 80% several years ago. At this level of efficiency, battery-based UPS solutions are comparable to flywheels from an efficiency point of view, Figure 27 [65]. Even though the efficiency of state-of-the-art UPS is very high, some data centers do not use UPS, mostly for cost reasons. For instance, Google is reported to use batteries directly on their servers instead of UPS [66].

The 480V or 400V 3-phase power used for distribution in the data center is in most parts of the world routed directly to server racks, whereas in the US it is stepped down to a lower voltage in a Power Distribution Unit (PDU) incurring some losses [61].

Server Power Supply Units (PSUs) convert the AC used for distribution to the DC used in the servers and also step down the voltage to 12 V. PSU efficiency has increased significantly in recent years, from below 80% to 90% or better for a broad range of loads, with peak efficiencies in excess of 94% for high quality PSUs [67,68], as seen in Figure 28.

<table>
<thead>
<tr>
<th>Power supply type</th>
<th>Efficiency @ 20% load</th>
<th>Efficiency @ 50% load</th>
<th>Efficiency @ 100% load</th>
<th>80 PLUS Certification</th>
</tr>
</thead>
<tbody>
<tr>
<td>460-watt</td>
<td>90.70%</td>
<td>93.20%</td>
<td>92.81%</td>
<td>Gold</td>
</tr>
<tr>
<td>750-watt</td>
<td>91.33%</td>
<td>94.58%</td>
<td>92.57%</td>
<td>Gold</td>
</tr>
<tr>
<td>1200-watt (AC)</td>
<td>86.84%</td>
<td>91.75%</td>
<td>91.19%</td>
<td>Silver</td>
</tr>
</tbody>
</table>

Figure 28. Power Supply Unit efficiencies from two vendors as certified by 80-Plus [69].
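
To translate these efficiencies into dissipated power, the sketch below computes the losses in each PSU at the three measured load points. It assumes that the load percentage refers to a fraction of the rated DC output, and that efficiency is output power divided by input power; both are assumptions about how the figures were measured.

```python
# Measured efficiencies from Figure 28, keyed by rated output (W) and load fraction.
PSU_EFFICIENCY = {
    460: {0.2: 0.9070, 0.5: 0.9320, 1.0: 0.9281},
    750: {0.2: 0.9133, 0.5: 0.9458, 1.0: 0.9257},
    1200: {0.2: 0.8684, 0.5: 0.9175, 1.0: 0.9119},
}


def psu_loss_watts(rated_watts, load_fraction):
    """Power dissipated in the PSU, assuming load is a fraction of rated DC output."""
    efficiency = PSU_EFFICIENCY[rated_watts][load_fraction]
    output_w = rated_watts * load_fraction
    input_w = output_w / efficiency
    return input_w - output_w


for rated in PSU_EFFICIENCY:
    losses = ", ".join(f"{int(f * 100)}%: {psu_loss_watts(rated, f):5.1f} W"
                       for f in (0.2, 0.5, 1.0))
    print(f"{rated:5d} W PSU losses at {losses}")
```

For example, the 750 W unit at half load delivers 375 W and dissipates roughly 21 W in the conversion.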

Most of the inefficiencies in the power supply chain within the data center have been eliminated in recent years and the remaining inefficiencies are small. It is worth noting that at least one conversion is necessary, since AC is used for electricity generation and distribution and DC is used for the electronics in servers. Conversion is also necessary to handle differences in voltage levels, and for “isolating” servers from the energy source.

2.5 Data center infrastructure efficiency summary

Over the last several years, advances in data center design and operation have led to an exceptional improvement in energy efficiency, largely through improved cooling techniques and through improvements in the efficiencies of UPS and PSUs. Data center efficiency has also improved by locating new data centers where “free” cooling can be used and chillers eliminated. Reuse of energy consumed in the data center can contribute to a significant reduction in overall energy use and emissions, and can be an important consideration for data centers in areas where hot water can be an effective energy source. Increasing data center operating temperatures and, in particular, the use of direct cooling technologies enabling significantly raised outlet water temperatures are of great interest and are being pursued by industry. In the planning of new data centers these issues should be considered together in a comprehensive way, as was done in the planning of the Computational Research and Theory Facility (CRTF) at University of California Berkeley [70].

With state-of-the-art data center PUEs of 1.2 or less, only very limited further energy efficiency gains are possible from improved data center design and operation. Significant additional gains must come from energy reuse and improved energy efficiency of the IT equipment and its use. Though power consumption of data centers generally has increased substantially, the energy efficiency of computer systems measured in terms of work per energy unit has improved considerably for decades, largely due to Moore’s law, but also due to numerous innovations in many areas, including management and operations. Next we will review some of these changes.
3. HPC system energy efficiency

The exponentially improved performance of computers, usually referred to as Moore’s Law [71], is well known. The technology evolution has for the last few decades resulted in a halving of feature sizes about every 54 months, a doubling of transistors per processor every 21 months, i.e. more rapidly than what reduced feature sizes would predict, and a doubling in performance as measured by MIPS (Million Instructions Per Second) about every 20 months. For larger HPC systems the performance improvement as measured by the Linpack benchmark [72] has been even more rapid. From the plot in Figure 29, of the history of systems on the Top500 list [73], the list of the 500 most powerful computer systems in the world as measured by the Linpack benchmark, it can be deduced that the performance has doubled on average every 13.64 months for the number 1 system whereas for the number 500 system the doubling time on average is 12.90 months.

A number of studies have been carried out to assess the improved energy efficiency of computers. Figure 30 shows the findings by Koomey et al. reported in [74], which extend the study by Nordhaus reported in [75]. The results indicate a doubling in the energy efficiency of computation about every 18.84 months. Note that the Top500 list measures performance of systems by the Linpack benchmark, while Koomey and Nordhaus use a composite synthetic measure that includes elements of the SPEC benchmarks [76] and other benchmarks as well as theoretical performance [75]. However, the rate of improvement in terms of energy efficiency should still be relevant for HPC systems, since Nordhaus in [75] provides a (fixed) relation between floating-point performance and the computations per second used in [75].

The difference between the growth rate of floating-point performance and the improvement in energy efficiency of computation is a good indicator of the growth in energy consumption of HPC systems, which according to these observations would amount to about 20%/yr, or about a factor of 6 over a decade. This is higher than the EPA projected growth rate of 14%/yr on average for the 2000–2006 period, or 17%/yr average for volume servers in its report to Congress [28], Figure 31, but in line with the findings of the Uptime Institute [77, 78, 79]. The growth rate of about 20%/yr on average is also consistent with our experience at PDC at KTH [80] where we have had to expand the infrastructure from less than 400 kVA in 2003 to 2 MW in 2010. The factor of 6 growth over a decade in power for HPC systems is also consistent with the average power consumption for the Top50 systems on the Top500 list as presented in [81] for June 2000, about 230 kW, and in [82] for June 2010, 1401 kW. In connection with procurements in 2007/2008 we estimated that the capital and operating cost of the infrastructure for the lifetime of the procured systems would be about 1.5 times the cost of the hardware, which is in line with the predictions of Belady [8].
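The growth estimate of about 20%/yr can be reproduced from the doubling times quoted above. The following is a minimal sketch, assuming that power equals performance divided by energy efficiency and that both follow clean exponentials; the function names are illustrative and not taken from the cited studies.

```python
# Doubling times in months, taken from the text above.
TOP500_NO1_DOUBLING = 13.64    # Linpack performance, no. 1 system
TOP500_NO500_DOUBLING = 12.90  # Linpack performance, no. 500 system
EFFICIENCY_DOUBLING = 18.84    # computations per unit of energy [74]


def annual_factor(doubling_months: float) -> float:
    """Yearly growth factor for a quantity that doubles every `doubling_months`."""
    return 2.0 ** (12.0 / doubling_months)


def power_growth_per_year(perf_doubling: float, eff_doubling: float) -> float:
    """Power = performance / efficiency, so the yearly growth factors divide."""
    return annual_factor(perf_doubling) / annual_factor(eff_doubling)


if __name__ == "__main__":
    cases = [("no. 1", TOP500_NO1_DOUBLING), ("no. 500", TOP500_NO500_DOUBLING)]
    for label, months in cases:
        g = power_growth_per_year(months, EFFICIENCY_DOUBLING)
        print(f"{label}: {100 * (g - 1):.1f}%/yr, factor {g ** 10:.1f} over a decade")
    print(f"20%/yr compounded over 10 years: factor {1.2 ** 10:.2f}")
```

Under these assumptions the two doubling times bracket the 20%/yr figure (roughly 18%/yr and 23%/yr), and 20%/yr compounded over ten years gives a factor of about 6.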

![Figure 31. Data center energy growth according to the EPA 2007 report to the US Congress [28].](image)

The reason for this very significant change is rapidly increasing energy costs, and the increased energy consumption of servers, as discussed above. For Sweden the electricity cost has increased about 7%/yr on average over the last 25 years while the US has experienced a lower growth rate. As can be seen from Figure 32 [83], the US price evolution has been highly variable and averages between 4% and 5% over the last 50 years for different consumer sectors. For the last five years, the price increase has been about 6% on average.
3.1 System architecture from an energy perspective
Koomey [74] and Nordhaus [75] have both reported a rapid improvement in the overall energy efficiency of computation. To understand both the past improvements in energy efficiency and future possibilities and challenges for HPC system energy efficiency improvements it is helpful both to understand the relative energy consumption of different parts of a system and the physics that governs the energy consumption of CMOS technology, the dominating technology today for CPUs and memory. It is also useful to understand the power management techniques introduced by vendors.

The energy consumption per transistor has improved by a factor of about 1 million over 30 years according to [84], as shown in Figure 33, which corresponds to a halving of energy consumption about every 18 months in line with the observations in [74].
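As a quick check, using only the two figures quoted above (a factor of about \( 10^6 \) over 30 years), the implied halving time is

\[ \frac{30 \times 12 \ \text{months}}{\log_2 10^6} \approx \frac{360}{19.9} \approx 18 \ \text{months}. \]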
As reported in [85], we recently designed a four-socket blade server targeting energy efficiency based on 6-core high-efficiency CPUs, resulting in a power distribution in the design stage as shown in Table 6 for a chassis of 10 blades. Since our design emphasized energy efficiency, our nodes are diskless, which removes one source of energy consumption. For the estimates in Table 6, four DIMMs per CPU socket are assumed. In [86,87], subsystem power consumption is given for a two-socket server, but only two DIMMs per socket are assumed and the CPU power consumption seems exceptionally low given that High-Efficiency (HE) x86 CPUs typically have a peak power rating of about 80W and high-performance x86 CPUs have a peak power rating of 130 – 140W. The measured peak power consumption for a chassis of the servers we designed is about 4650W, or about 92% of the estimated peak power consumption. About two-thirds of the peak power is consumed by CPUs and memory. The power distribution among subsystems for our design is fairly typical for current HPC servers. Though gains in energy efficiency are possible by reducing energy consumed by PSUs, fans, interconnect and motherboards, major improvements must address the energy efficiency of CPUs and memory.

Figure 33. Evolution of energy consumption per transistor [84].
The energy consumption of memory and CPUs depends on the feature sizes of the technology being used as indicated by Figure 33, but for any given feature size it also depends on operating voltage and frequency. For CMOS the relationship between power, voltage and frequency is

\[ P = c_1 V^2 f + c_2 V + c_3 + O(V^4) \]

where \( c_1, c_2 \) and \( c_3 \) are constants, \( V \) the supply voltage, and \( f \) the operating frequency. The first term represents dynamic power and is dominant in today’s CMOS; the second and third terms represent leakage and board power while the last term captures fan power. With the first term dominating, the power needed scales with the square of the voltage and linearly with the clock frequency. However, the frequency is also roughly proportional to the voltage setting under normal operating conditions. Hence, in fact, the power is approximately proportional to \( f^3 \). This relationship is exploited both in controlling standard x86 CPUs, as illustrated in Figure 34 [88], and in the design points for different CPUs.

A typical load to power relationship is shown in Figure 35 [87], in which the disk power draw is independent of load and memory power consumption is also fairly constant, except for the idle case. CPU power more than triples from idle to full load. Nevertheless, idle power is about half of the power under full load. This is of great concern from an energy efficiency perspective for many usage scenarios, in particular for Internet and Web applications [89]. HPC systems often have workloads and queuing systems that assure a sustained high load, and hence idle power is not of major concern for HPC.
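The cubic dependence of dynamic power on frequency can be made concrete with a small sketch. It keeps only the dynamic term \( c_1 V^2 f \) of the power expression, assumes the voltage is exactly proportional to the frequency, and assumes a compute-bound job whose run time is inversely proportional to the frequency. The constants `c1` and `volts_per_ghz` are placeholders chosen for illustration, not measured values.

```python
def dynamic_power(freq_ghz: float, c1: float = 1.0, volts_per_ghz: float = 0.4) -> float:
    """Dynamic CMOS power c1 * V^2 * f, with V assumed proportional to f.

    c1 and volts_per_ghz are illustrative constants, not measured values.
    """
    voltage = volts_per_ghz * freq_ghz
    return c1 * voltage ** 2 * freq_ghz


def energy_for_fixed_work(freq_ghz: float, cycles_g: float = 10.0) -> float:
    """Energy for a compute-bound job of `cycles_g` billion clock cycles."""
    runtime_s = cycles_g / freq_ghz  # billions of cycles / GHz = seconds
    return dynamic_power(freq_ghz) * runtime_s


if __name__ == "__main__":
    p_ratio = dynamic_power(2.0) / dynamic_power(1.0)
    e_ratio = energy_for_fixed_work(2.0) / energy_for_fixed_work(1.0)
    print(f"Doubling f: power x{p_ratio:.0f}, run time x0.5, energy x{e_ratio:.0f}")
```

In this idealized model, doubling the clock frequency halves the run time but multiplies power by eight and the energy for the same work by four. Real CPUs deviate from this because of the leakage and constant terms and because voltage cannot be lowered indefinitely.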

![Figure 35](image1.png)

Figure 35. Power consumption as a function of load on a typical server [87].

Another good illustration of the relationship between voltage, frequency, performance and power is shown in Figure 36 showing the characteristics of Intel’s 80-core experimental CPU [90].

![Figure 36](image2.png)

Figure 36. Power and performance relationship for the Intel Polaris research chip [90].
From these observations we conclude that minimizing execution time by maximizing execution rate may in fact be very energy inefficient because power needs increase more rapidly than execution times decrease. This is the premise on which multi-core chips are based, as seen in current CPUs that for a fixed power envelope and technology tend to have slower cores the more cores there are, as illustrated for the AMD Magny-Cours chips in Table 7. We also conclude that managing the state of the cores as a function of workload is important for overall energy efficiency. Further, we observe that the Intel 80-core experimental CPU operates in the same power range as a standard Intel CPU, which for the 65 nm technology used for the 80-core chip had up to 4 cores (Clovertown) [91]. This highlights that the cores on the experimental chip are much simpler and smaller: 100 million transistors for 80 cores vs 582 million for the 4-core Clovertown [92]. Yet, for the Linpack benchmark the experimental chip achieves in excess of 1 TF compared to about 38 GF for the Clovertown chip. This illustrates that, in regard to energy efficiency, simpler, lower power cores may be of great interest for HPC, as also discussed in [93] where it was shown that for some scientific applications only 80 out of 300 x86 assembly language instructions were needed.

<table>
<thead>
<tr>
<th>Cores</th>
<th>Clock (GHz)</th>
<th>ACP (W)</th>
</tr>
</thead>
<tbody>
<tr>
<td>12</td>
<td>2.2</td>
<td>80</td>
</tr>
<tr>
<td>8</td>
<td>2.4</td>
<td>80</td>
</tr>
<tr>
<td>12</td>
<td>1.7</td>
<td>65</td>
</tr>
<tr>
<td>8</td>
<td>2.0</td>
<td>65</td>
</tr>
</tbody>
</table>

Table 7. Sample AMD multi-core CPUs.
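The trade-off in Table 7 can be illustrated by computing the aggregate clock rate per watt for each part. This is a rough sketch: cores times GHz is used here only as a crude proxy for throughput on well-parallelized work, and it is not a metric used in the original table.

```python
# (cores, clock in GHz, rated power in W) copied from Table 7.
TABLE_7 = [
    (12, 2.2, 80),
    (8, 2.4, 80),
    (12, 1.7, 65),
    (8, 2.0, 65),
]


def core_ghz_per_watt(cores: int, clock_ghz: float, watts: float) -> float:
    """Aggregate clock rate (cores x GHz) per watt, a crude throughput proxy."""
    return cores * clock_ghz / watts


if __name__ == "__main__":
    for cores, clock, watts in TABLE_7:
        value = core_ghz_per_watt(cores, clock, watts)
        print(f"{cores:2d} cores @ {clock:.1f} GHz, {watts} W: {value:.3f} core-GHz/W")
```

By this proxy, within each power class the 12-core part delivers roughly 28% to 38% more aggregate clock rate per watt than the 8-core part, despite its slower cores.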

3.2 Multi-core CPUs

The heat density of standard CPUs, as illustrated in Figure 17, forced commodity CPU vendors to seek new ways to exploit the continually increased capabilities offered by decreased feature sizes (“Moore’s law”). The exponential improvement is expected to continue through this decade [94]. The industry’s solution to exploit increased capability without increased power consumption was multi-core CPUs.

Technology demonstration systems based on dual-core AMD CPUs [95,96] and dual-core PowerPC CPUs [97,98] appeared in 2004. AMD, Intel [99], and IBM all delivered dual-core microprocessor CPUs for production systems in 2005. For more complex processors, IBM had already introduced dual-core CPUs in 2001 [100] for their Power4 processors. Today, AMD offers CPUs with up to 12 cores with frequencies up to 2.3 GHz and a maximum power dissipation of about 137W while Intel offers CPUs with up to 8 cores. Intel’s 6-core CPUs have a maximum power dissipation of 130W and maximum clock frequency in turbo mode of 3.6 GHz.

Specialized CPUs, such as Graphics Processing Units (GPUs), today typically have hundreds of cores. As examples, the nVidia Fermi GPU has 512 stream processor cores [101] with a maximum power consumption of 225W and a peak theoretical double precision performance of 515 GF [102], and the AMD FireStream 9370 has 1,600 stream processor cores [103] with a maximum power consumption of 225W and a theoretical peak double precision performance of 528 GF. Though the power consumption of GPUs is about twice that of x86 architecture CPUs, or more, the peak double precision performance/W is about three times higher than that of the x86 CPUs.

Recently, in the quest for energy efficient servers, there has been an increased interest in processors used in the embedded and mobile markets, such as the ARM processors [104] that are widely used in the mobile market, the Intel Atom processor [105], and Digital Signal Processors, such as the Texas Instruments TMS320C6678 [106] capable of 40 GF in double precision at about 10W, which is still significantly less than the ClearSpeed CX700 Floating-Point Processor [107] that has about the same power consumption but a theoretical peak of 96 GF.

The possible improvement in energy efficiency of conventional CPUs is well demonstrated by the CPU designed for the IBM Blue Gene/Q for which little information is publicly available. However, from [73] it can be deduced that the CPU has an impressive energy efficiency with a theoretical peak performance of 204.8 GF at 1.6 GHz and an estimated power draw of about 50W. The processor characteristics are summarized in Table 8. Power and theoretical peak performance data in the Table are in some cases estimates, in other cases from public specifications. The intent is only to show relative qualities.

<table>
<thead>
<tr>
<th>Processor</th>
<th>Cores</th>
<th>W</th>
<th>GF/W</th>
</tr>
</thead>
<tbody>
<tr>
<td>ARM Cortex-A9</td>
<td>4</td>
<td>~2</td>
<td>~0.5</td>
</tr>
<tr>
<td>ATOM</td>
<td>2</td>
<td>2+</td>
<td></td>
</tr>
<tr>
<td>AMD 12-core</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>Intel 6-core</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>ATi 9370</td>
<td>1600</td>
<td>225</td>
<td>~2.3</td>
</tr>
<tr>
<td>nVidia Fermi</td>
<td>512</td>
<td>225</td>
<td>~2.3</td>
</tr>
<tr>
<td>TMS320C6678</td>
<td>8</td>
<td>10</td>
<td></td>
</tr>
<tr>
<td>IBM BQC</td>
<td></td>
<td></td>
<td></td>
</tr>
<tr>
<td>ClearSpeed CX700</td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

Table 8. Estimates of theoretical performance/W for some processor alternatives.
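For the processors whose peak performance and power are both quoted in the text of this section, the performance/W follows by simple division. The sketch below uses only those quoted figures; the power values for the DSP, the CX700 and the BQC are stated in the text as approximate, so the results are rough estimates.

```python
# (peak double-precision GF, power in W) as quoted in the text of this section.
# The power values for the DSP, the CX700 and the BQC are approximate.
QUOTED = {
    "nVidia Fermi": (515.0, 225.0),
    "AMD FireStream 9370": (528.0, 225.0),
    "TI TMS320C6678": (40.0, 10.0),
    "ClearSpeed CX700": (96.0, 10.0),
    "IBM BQC": (204.8, 50.0),
}


def gf_per_watt(peak_gf: float, watts: float) -> float:
    """Theoretical peak performance per watt in GF/W."""
    return peak_gf / watts


if __name__ == "__main__":
    for name, (peak_gf, watts) in QUOTED.items():
        print(f"{name:22s} {gf_per_watt(peak_gf, watts):5.2f} GF/W")
```

With these inputs the two GPUs come out at about 2.3 GF/W, the DSP and the BQC at about 4 GF/W, and the CX700 at close to 10 GF/W.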

From Table 8 we note that in terms of theoretical peak double-precision floating-point performance, the CPUs designed for mobile markets have a performance/W comparable to the x86 based CPUs that are designed with floating-point intensive applications in mind. The mobile CPUs however are expected to have advantages over x86 based CPUs for applications not dominated by floating-point operations and have more evolved power management features than x86 based systems. The benefits of such features are not captured in Table 8. The focus on low power in the design of mobile CPUs and their energy management features are the foundation for the current interest in servers based on low power processors [108,109], such as ARM [110, 111,112] and ATOM [113]. Servers accelerated with GPUs have received a great deal of interest in recent years as their performance has increased dramatically, in particular in terms of double precision floating-point, and programmability improved. However, integration into servers still is via the I/O bus (PCI Express) which can degrade possible application performance gains substantially. The ClearSpeed accelerator faces the same integration issues, though it fared better than GPUs in a study carried out in porting some benchmarks to accelerated systems [114]. The IBM BQC processor, for which not much information is available at this time, has an impressive energy efficiency and is likely to offer good performance and not require much effort in porting codes used on clusters, unlike porting of codes to accelerator based systems.
3.3 Energy Efficient HPC systems

Though energy efficiency at the CPU and server level has been a key consideration for component and system vendors for a good part of the last decade, the number of whole system design efforts focusing on energy efficiency has been few. However, a good example of the industry’s efforts at energy efficient HPC systems design is IBM’s Blue Gene series, starting with the Blue Gene/L (BG/L) introduced in 2004 after a five-year development effort, followed by the Blue Gene/P in 2007, and to be followed by the Blue Gene/Q in 2011 [115,116]. The BG/L was based on the dual-core PowerPC CPU. The BG/L not only set a record in terms of performance, assuming the no. 1 position in the November 2004 Top500 list, but also in terms of energy efficiency. In the inaugural Green500 list in November 2007 [117] of the most energy efficient systems on the Top500 list, BG/L systems held positions 6 through 26 with positions 1 through 5 held by the second generation Blue Gene system, the BG/P introduced the same year. The Blue Gene/Q to be delivered in 2011 holds the no. 1 position in the most recent Green500 list [118] with an efficiency of 1,684 MF/W. The 2nd most energy efficient system without accelerator on the November 2010 list used SPARC64 VIII CPUs and achieved an efficiency about half that of the BG/Q, 829 MF/W, and position 4 on the list. The third most energy efficient system without accelerator used Intel 6-core CPUs and achieved 400 MF/W, about a quarter of the BG/Q, and position 18. The most energy efficient GPU accelerated system achieved an efficiency of 958 MF/W and the no. 2 position, while the most energy efficient system using the Cell Broadband Engine [119,120] for acceleration [121] achieved 773 MF/W and assumed position 5. On the June 2008 Green500 list Cell accelerated IBM systems occupied the three top positions with an efficiency of 488 MF/W. The top 10 positions on the November 2010 Green500 list are shown in Figure 37.

<table>
<thead>
<tr>
<th>Green500 Rank</th>
<th>MFLOPS/W</th>
<th>Site*</th>
<th>Computer*</th>
<th>Total Power (kW)</th>
</tr>
</thead>
<tbody>
<tr>
<td>1</td>
<td>1684.20</td>
<td>IBM Thomas J. Watson Research Center</td>
<td>NNSA/SC Blue Gene/Q Prototype</td>
<td>38.90</td>
</tr>
<tr>
<td>2</td>
<td>958.35</td>
<td>GSIC Center, Tokyo Institute of Technology</td>
<td>HP ProLiant SL390s G7 Xeon 6C X5670, Nvidia GPU, Linux/Windows</td>
<td>1243.80</td>
</tr>
<tr>
<td>3</td>
<td>933.06</td>
<td>NCSA</td>
<td>Hybrid Cluster Core i3 2.93GHz Dual Core, NVIDIA C2050, Infiniband</td>
<td>36.00</td>
</tr>
<tr>
<td>4</td>
<td>829.87</td>
<td>RIKEN Advanced Institute for Computational Science</td>
<td>K computer, SPARC64 VIIIfx 2.6GHz, Tofu interconnect</td>
<td>57.96</td>
</tr>
<tr>
<td>5</td>
<td>773.38</td>
<td>Forschungszentrum Jülich (FZJ)</td>
<td>QPACE SFB TR Cluster, PowerXCell 8i, 3.2 GHz, 3D-Torus</td>
<td>57.54</td>
</tr>
<tr>
<td>6</td>
<td>773.38</td>
<td>Universität Regensburg</td>
<td>QPACE SFB TR Cluster, PowerXCell 8i, 3.2 GHz, 3D-Torus</td>
<td>57.54</td>
</tr>
<tr>
<td>7</td>
<td>773.38</td>
<td>Universität Wuppertal</td>
<td>QPACE SFB TR Cluster, PowerXCell 8i, 3.2 GHz, 3D-Torus</td>
<td>57.54</td>
</tr>
<tr>
<td>8</td>
<td>740.78</td>
<td>Universität Frankfurt</td>
<td>Supermicro Cluster, QC Opteron 2.1 GHz, ATI Radeon GPU, Infiniband</td>
<td>385.00</td>
</tr>
<tr>
<td>9</td>
<td>677.12</td>
<td>Georgia Institute of Technology</td>
<td>HP ProLiant SL390s G7 Xeon 6C X5660 2.8GHz, nVidia Fermi, Infiniband QDR</td>
<td>94.40</td>
</tr>
<tr>
<td>10</td>
<td>636.36</td>
<td>National Institute for Environmental Studies</td>
<td>GOSAT Research Computation Facility, nvidia</td>
<td>117.15</td>
</tr>
</tbody>
</table>

Figure 37. The 10 most energy efficient systems on the Top500 list, November 2010 [118]
Another design targeting energy efficiency was SiCortex’s MIPS based systems (the company closed during the spring of 2009), which used relatively slow cores, initially 500 MHz, later 700 MHz [122] (similar to the BG/L (700 MHz) and BG/P (850 MHz at ~7W/core)). Another interesting system design is the Green Flash architecture proposed by Lawrence Berkeley National Laboratory [123,124], based on Tensilica Xtensa CPUs (650 MHz at ~0.7W/core) [125,126]. A comparison of the expected energy efficiency with x86 and PowerPC based CPUs is shown in Table 9. As seen from the table, a factor of seven improved efficiency over the PowerPC CPUs used in the BG/P and a factor of 60 over an AMD Opteron processor is expected.

<table>
<thead>
<tr>
<th>Processor</th>
<th>Clock (GHz)</th>
<th>Peak/core (GF)</th>
<th>Cores/sockets</th>
<th>Sockets</th>
<th>Cores</th>
<th>Power (MW)</th>
<th>Cost 2008</th>
</tr>
</thead>
<tbody>
<tr>
<td>AMD Opteron</td>
<td>2.8</td>
<td>5.6</td>
<td>2</td>
<td>890k</td>
<td>1.7M</td>
<td>179</td>
<td>$1B+</td>
</tr>
<tr>
<td>IBM BG/P</td>
<td>0.850</td>
<td>3.4</td>
<td>4</td>
<td>740k</td>
<td>3.0M</td>
<td>20</td>
<td>$1B+</td>
</tr>
<tr>
<td>Green Flash/Tensilica Xtensa</td>
<td>0.650</td>
<td>2.7</td>
<td>32</td>
<td>120k</td>
<td>4.0M</td>
<td>3</td>
<td>$75M</td>
</tr>
</tbody>
</table>

Table 9. Comparisons of expected power consumption of a 200 PF system based on different CPUs [127].
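The improvement factors quoted in the text follow directly from the power column of Table 9:

\[ \frac{179 \ \text{MW}}{3 \ \text{MW}} \approx 60, \qquad \frac{20 \ \text{MW}}{3 \ \text{MW}} \approx 6.7 . \]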

3.4 Energy Efficiency HPC Systems Summary

Whereas for over a decade there was convergence in the design of HPC systems driven by the cost effectiveness of commodity technologies, the increased energy consumption and increased cost of energy have brought about a divergence in HPC architecture. Some form of acceleration is likely to become common, but in the near term it is hampered by relatively poor integration into systems, since accelerators typically use I/O bus technology for communication with their hosts. However, that might change in the not too distant future. Intel recently released a development platform with 32 vector cores per chip [128,129] with plans to release a follow-up product with 50 or more cores. It will also be interesting to follow the adoption (or not) of low energy CPUs for HPC systems for which some (many) features of today’s x86 CPUs are not needed, as well as how the opportunities for dynamic/application dependent power management of components and subsystems will evolve.

The challenges in building systems for the next major HPC performance target, Exa-scale systems, i.e. systems capable of $10^{18}$ operations per second, are significant not only in terms of the level of concurrency applications need to exhibit, but also in how systems with tens of billions of threads will be managed, and how the energy issues will be resolved. Without a significant change in technology and possibly architecture, Exa-scale systems towards the end of the decade have been estimated to consume 70 MW [130] to 130 MW [131]. In estimates by Intel [132], the majority of the power consumption is expected to be due to memory and the processor interconnection network. In Intel’s prediction, an improvement in the CPU energy efficiency to 100 GF/W by 2018 is assumed, resulting in 10 MW of power for the CPUs of a system with a peak performance of 1 EF. This level of efficiency represents a 25-fold improvement over the IBM BQC efficiency. This level of improvement is in line with the historical trend [74], though significant technical challenges must be successfully addressed for the past trend to continue. For memory, 40 MW is estimated for a 1 EF peak system assuming memory is integrated with the processors in a Memory In Processor (MIP) [133,134] architecture. For the interconnect a power requirement of 5 mW/Gbps is estimated, and with a target of 0.1 byte/F, 50 MW is required for the interconnect [132]. For the interconnection network this represents an improvement of about a factor of 10 compared to state-of-the-art today. Though a high-speed network clearly is needed, the extent to which network performance impacts application performance is subject to debate and is also application as well as architecture dependent. In [135] it is shown that the nominally highest performing interconnection technology is not necessarily the most efficient, whereas in [136] the use of a low latency high-bandwidth interconnection network is shown to improve the energy efficiency by close to a factor of two, as shown in Figure 38. However, in this study no detailed energy measurements are reported and energy consumption is assumed to be independent of the interconnection network used and proportional to run-time.
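The arithmetic behind these Exa-scale estimates can be checked with a short sketch. It uses only figures quoted in the text; summing the three components into a total and extrapolating the historical doubling time from 2010 to 2018 are assumptions made here for illustration and are not taken from [132].

```python
PEAK_FLOPS = 1e18                   # 1 EF peak
CPU_GF_PER_W = 100.0                # CPU efficiency assumed for 2018 in [132]
BQC_GF_PER_W = 204.8 / 50.0         # IBM BQC estimate quoted earlier in the text
MEMORY_MW = 40.0                    # memory estimate quoted in the text
INTERCONNECT_MW = 50.0              # interconnect estimate quoted in the text
EFFICIENCY_DOUBLING_MONTHS = 18.84  # historical trend [74]


def cpu_power_mw(peak_flops: float, gf_per_w: float) -> float:
    """CPU power in MW for a given peak rate and efficiency in GF/W."""
    watts = peak_flops / (gf_per_w * 1e9)
    return watts / 1e6


def trend_improvement(years: float, doubling_months: float) -> float:
    """Efficiency gain expected if the historical doubling time persists."""
    return 2.0 ** (12.0 * years / doubling_months)


if __name__ == "__main__":
    cpu_mw = cpu_power_mw(PEAK_FLOPS, CPU_GF_PER_W)
    print(f"CPU power: {cpu_mw:.0f} MW")
    print(f"Sum of CPU, memory, interconnect: {cpu_mw + MEMORY_MW + INTERCONNECT_MW:.0f} MW")
    print(f"Required gain over BQC: x{CPU_GF_PER_W / BQC_GF_PER_W:.1f}")
    print(f"Trend gain over 8 years: x{trend_improvement(8, EFFICIENCY_DOUBLING_MONTHS):.1f}")
```

The three quoted components sum to about 100 MW, which lies within the 70 MW to 130 MW range cited above, and an eight-year extrapolation of the historical trend gives a gain somewhat larger than the roughly 25-fold improvement required.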

Energy consumption of an Exa-scale system is perceived as one of the most serious challenges in realizing such a system [137]. Because of the great challenges in regard to power consumption for future high-end systems and the lack of a clear pathway, approaches taken in markets with a traditionally high emphasis on energy efficiency are now being considered also for HPC. In [138] the impact of using approaches from the embedded market, including instruction set simplification and alternate memory system designs, is discussed. The potential benefits of new architectures for the energy efficiency of computations, also using the embedded market as a starting point, are illustrated in [139], in which it is highlighted that a typical laptop CPU requires about 4,000 times as much energy as an ASIC for many operations whereas a Digital Signal Processor (DSP) may require about 250 times the energy of an ASIC for comparable operations. The architecture proposed in [139] is claimed to consume only a small multiple of the energy of an ASIC design.

4. System operations
Data center design and operations have evolved to a point where the inefficiencies in the infrastructure are relatively small and most potential energy gains from a facilities point of view are to be made by energy reuse. The component and platform industries have energy efficiency and environmental impact as one of their foremost concerns, a concern that has driven architecture and system design for a few years. One aspect of this concern is the trend towards an increased ability to adjust the operating state of systems according to the workload, resulting in technologies such as Intel’s SpeedStep [140] and AMD’s PowerNow [141]. Underlying these technologies is the ability to control the CPU operating conditions wholly or in part. Power management has been structured around power planes, with all parts in a power plane having a common power feed and voltage. Within a power domain operating conditions can still be controlled through what is commonly known as performance states. The Advanced Configuration and Power Interface (ACPI) [142] standard refers to these two aspects as C-states (processor power states) and P-states (processor performance states) respectively. The processor power states are characterized as one operating state (labeled C0) and a range of sleep states in which instructions are not executed. Depending on the parts being powered down, restoring execution state will take different amounts of time. Sleep states do not include a state requiring reboot. Table 10 [143] illustrates the times to active state from a few sleep states on Intel mobile CPUs.

<table>
<thead>
<tr>
<th>C-State</th>
<th>Typical Worst Case Exit Latency Time</th>
</tr>
</thead>
<tbody>
<tr>
<td>C1</td>
<td>1µs</td>
</tr>
<tr>
<td>C3</td>
<td>80µs</td>
</tr>
<tr>
<td>C6</td>
<td>104µs</td>
</tr>
<tr>
<td>C7</td>
<td>109µs</td>
</tr>
</tbody>
</table>

Table 10. Time to active state from sleep state for some mobile Intel CPUs [143].
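The trade-off between sleep depth and wake-up time can be sketched as a simple selection rule: choose the deepest sleep state whose worst-case exit latency still fits within the latency the workload can tolerate. The rule and the function name below are hypothetical illustrations built on the Table 10 values; they do not describe how any particular operating system or CPU selects C-states.

```python
# Worst-case exit latencies in microseconds, copied from Table 10,
# ordered from the shallowest to the deepest sleep state.
EXIT_LATENCY_US = [("C1", 1), ("C3", 80), ("C6", 104), ("C7", 109)]


def deepest_state_within(budget_us: float) -> str:
    """Return the deepest C-state whose exit latency fits the budget.

    Falls back to C0 (stay active) when even C1 would wake up too slowly.
    This selection rule is a hypothetical illustration, not a vendor policy.
    """
    chosen = "C0"
    for state, latency_us in EXIT_LATENCY_US:
        if latency_us <= budget_us:
            chosen = state
    return chosen


if __name__ == "__main__":
    for budget in (0.5, 50, 100, 200):
        print(f"tolerable wake-up latency {budget} us -> {deepest_state_within(budget)}")
```

A real policy would also weigh how long the core is expected to stay idle against the energy cost of entering and leaving each state, which this sketch ignores.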

With the emergence of multi-core processors the number of power domains on a chip has been increasing, with Intel in its current generation six-core chips having a separate power domain for each core and the memory controller having its own power domain as well [144]. Within a power domain CPU manufacturers have chosen to implement different numbers of C-states, a number that tends to evolve with product generations. The significance of the sleep states in regard to energy consumption in idle mode is illustrated by Figure 39, which shows the power consumption in a typical active state. In a sleep state allowing for a very quick recovery to active state, the power for logic and local clocks can be eliminated, reducing the power consumption to typically less than 50% of active power. CPU states of “deeper sleep” may imply shutting down clock distribution and sections of logic for reduced leakage currents, enabling an inactive core to reduce its power consumption to a small fraction of its active power, Figure 40.

---

Figure 39. Typical power consumption for a core in active state [144].

Figure 40. Reduced power consumption using “deep” sleep power states on Intel CPUs [144].
With cores in active state the performance is controlled by altering the clock frequency of the cores, with different cores possibly operating at different frequencies, a feature enabled on recent CPUs from both AMD and Intel. AMD currently supports five different P-states from a low frequency of 800 MHz up to the maximum for the product [145,146]. Intel supports an even larger number of P-states. The increased ability to control the power state and operating conditions of an increasing number of components on a chip also increases the complexity of controlling the CPU, leading to the introduction of a separate power control unit on recent CPUs [144], Figure 41. Controlling the power states of the caches represents its own set of challenges, with AMD using its SmartFetch technology [141] to store L1 and L2 cache content in the L3 cache to enable powering down L1 and L2 caches. A similar approach is used by Intel on their Nehalem CPUs, but on its Westmere generation CPUs Intel is reported to use a special SRAM for saving cache content, enabling all three levels of cache to be powered down [147]. A summary of the power management features on the current generation AMD Opteron CPUs can be found in [146].

To stimulate research into multi-core chip technology including power management, Intel has produced the Single-chip Cloud Computer (SCC) [148,149,150] that has 24 dual-core processors (total 48 cores) in six power domains with one additional power domain for the on-chip interconnection network and routers, and another power domain for the remaining parts of the chip, i.e. eight power domains in total. Each dual-core processor has its own frequency control, but the cores on a processor do not have individual frequency control. Memory, I/O and on-chip networks have their own independent frequency control. In all there are 8 power domains and 28 frequency domains on the SCC, Figure 42.
Figure 43 shows the power consumption of the chip under light and high load [149]. The effectiveness of the core power management is apparent, with the cores in the low load state consuming less than 10% of the power they consume in the high load state. However, Figure 43 also shows the validity of the concerns about memory power consumption: memory consumes about 20% of the power in the high load state, but about 70% in the low load state. The power consumption of the memory only declines by about 20% from high to low load.

Figure 43. Power consumption of the Intel Single-chip Cloud Computer under high and low load. [149]
The interest in dynamic voltage and frequency scaling (DVFS) to gain energy efficiency is relatively recent for the HPC market, but it is common practice in the mobile market. For HPC applications a 20 – 25% reduction in power consumption for a 3 – 5% slow-down was reported in [151] using an automatic run-time procedure adjusting frequency to load. The tests were carried out on AMD CPUs. Similar predictions were made in [152]. Another test was carried out in [153] on an Intel Mobile CPU showing a potential energy saving of about a factor of 2.7 for about a 2% slowdown, Table 11.

<table>
<thead>
<tr>
<th>Core frequency (GHz)</th>
<th>T$_{\text{exec}}$ (s)</th>
<th>Energy (J)</th>
</tr>
</thead>
<tbody>
<tr>
<td>0.8</td>
<td>6.74</td>
<td>50.21</td>
</tr>
<tr>
<td>1.2</td>
<td>4.53</td>
<td>57.21</td>
</tr>
<tr>
<td>1.6</td>
<td>4.50</td>
<td>85.09</td>
</tr>
<tr>
<td>2.0</td>
<td>4.46</td>
<td>116.99</td>
</tr>
<tr>
<td>2.4</td>
<td>4.45</td>
<td>155.75</td>
</tr>
</tbody>
</table>

Table 11. Execution time and energy consumption for a sparse-matrix vector multiplication on an Intel T7700 laptop CPU [153].
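The savings quoted in the text can be derived directly from Table 11. The sketch below compares each frequency setting with the highest one and also reports the mean power, computed here as energy divided by execution time; the helper name is illustrative.

```python
# (core frequency in GHz, execution time in s, energy in J) from Table 11.
TABLE_11 = [
    (0.8, 6.74, 50.21),
    (1.2, 4.53, 57.21),
    (1.6, 4.50, 85.09),
    (2.0, 4.46, 116.99),
    (2.4, 4.45, 155.75),
]


def relative_to_max(rows):
    """Yield (GHz, slowdown %, energy saving factor, mean power W) per row,
    each measured against the highest-frequency row."""
    _, t_ref, e_ref = max(rows)  # tuples compare by frequency first
    for ghz, t, e in rows:
        yield ghz, 100.0 * (t / t_ref - 1.0), e_ref / e, e / t


if __name__ == "__main__":
    print("GHz  slowdown  saving  mean power")
    for ghz, slowdown, saving, power in relative_to_max(TABLE_11):
        print(f"{ghz:3.1f}  {slowdown:7.1f}%  x{saving:4.2f}  {power:6.1f} W")
```

At 1.2 GHz the run is about 1.8% slower than at 2.4 GHz while using about 2.7 times less energy. Dropping further to 0.8 GHz saves only a little more energy at the cost of a slowdown of about 50%, presumably because the kernel stops being limited by memory at that point.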

In summary, load related power management can result in significant energy savings. CPUs for the HPC market have an increasingly rich set of control possibilities to adjust CPU behavior to the application demands. How to exploit these features has only received modest interest from the HPC research community. Initiatives such as Intel’s MIC [128] and SCC [148] will hopefully change that. A big problem though is the relatively large power consumption of memory and the still limited ability to control its power consumption. Memory is produced using the same basic technology as CPUs and hence the power and frequency scaling is similar. Low voltage DDR3 memory that operates at 1.35V instead of the standard 1.5V is estimated to reduce memory power consumption by about 15 – 20% [154], Figure 44. Though the use of low-voltage DRAM will improve energy efficiency, a different memory architecture or different technologies will be necessary [130], as well as effective memory management from an energy perspective [89]. The difference in energy consumption by memory depending on its integration into the systems may be a factor of as much as 50 [138]. Embedded DRAM [155], eDRAM, used in many mobile devices but also in the IBM Power7 [156], may offer an energy savings of a factor of 2 – 4 or more [157]. An exciting development that seems to make good advances towards commercial reality [158] is that of the memristor [159,160], which has the potential to reduce memory power consumption, increase memory density up to 1 Tbit/cm², and increase speed by up to an order of magnitude.

Figure 44. The energy benefit of low voltage DDR3 memory (source Samsung) [154]

The processor interconnection network though currently not a dominating energy consumer for HPC systems is predicted to become one. Today, like memory, interconnection networks are always on and the energy consumption fairly independent of load, Figure 45 [161]. However, the concern about energy efficiency of networks has also caused the network community to engage in work towards management of networks from an energy perspective seeking to make energy consumption related to usage either through rate adjustment or through introducing sleep modes [162].

![Figure 45. Ethernet power consumption in idle and active state. [161]](image-url)
5. Software
Finally, software at various levels has a big role in improving delivered energy efficiency. Some specific examples of the efficiency of scientific applications measured as percentage of peak floating-point performance, typically in the 5 – 30 % or so range, are found in [163,164]. A highly optimized newly developed code to achieve scalability received a Gordon Bell Award for performance at the SC10 conference with an efficiency of 34% [165]. If the application is memory bandwidth limited this may be acceptable from an energy efficiency point of view, in particular if the CPUs can be controlled to operate with reduced power. However, we suspect that in most cases poor scalability, poor match between chosen algorithms and the architecture, or simply codes not written for efficiency of resource use or energy efficiency may be the source of poor efficiency. Rewriting codes with energy efficiency in mind may result in a significant pay-off. As an example, improved software resulted in a server efficiency gain of 29% in 2009 at Akamai Technologies [166]. But, as discussed in [164] new execution models may be required for significantly improved energy efficiency in addition to choosing algorithms based on their possible implementations in energy efficient codes.

6. Summary
Data center energy demands have been rising much more rapidly than overall electric energy demands. In fact, demand has risen by about 20%/yr on average over the last decade, forcing many data centers either to refurbish existing facilities or to acquire, refurbish or build new ones. The rapidly increasing energy demands and associated cooling have made energy related capital and operating infrastructure costs exceed those of the IT equipment and have led to large improvements in data center energy efficiency. The use of free cooling can significantly improve energy efficiency by reducing the need for chillers, or eliminating them entirely, a fact that can affect the location as well as the design and operation of data centers. The efficiency of the power distribution system in data centers has also improved significantly, with high quality server power supplies having an efficiency in excess of 90% for a broad range of loads and a peak efficiency of about 95%. High quality uninterruptible power supplies now reach efficiencies of about 98% for load levels of 50% or more. However, in the case of HPC systems many centers restrict the use of UPS to critical servers and networks and do not cover the HPC systems themselves. The improvements in data center design and operation, including raised inlet temperatures, have led to 80% or more of the energy being used by the IT systems.
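The percentages quoted above combine in a simple way. The sketch below is illustrative only: it assumes the UPS and the server power supply are the only two conversion stages and that they sit in series, and the function names are hypothetical. The ratio of total facility energy to IT energy computed here is the quantity commonly reported as power usage effectiveness.

```python
def chain_efficiency(*stage_efficiencies):
    """Overall efficiency of conversion stages connected in series."""
    overall = 1.0
    for eff in stage_efficiencies:
        if not 0.0 < eff <= 1.0:
            raise ValueError("each efficiency must be in (0, 1]")
        overall *= eff
    return overall


def facility_to_it_ratio(it_fraction):
    """Total facility energy divided by the energy used by the IT systems."""
    if not 0.0 < it_fraction <= 1.0:
        raise ValueError("it_fraction must be in (0, 1]")
    return 1.0 / it_fraction


if __name__ == "__main__":
    # Figures quoted in the text: UPS about 98%, power supply 90% to 95%.
    for psu in (0.90, 0.95):
        print(f"UPS 98% and PSU {psu:.0%}: {chain_efficiency(0.98, psu):.1%} delivered")
    # 80% of the energy used by the IT systems.
    print(f"facility/IT energy ratio: {facility_to_it_ratio(0.80):.2f}")
```

With these figures, between roughly 88% and 93% of the power entering the UPS reaches the server components, and an IT share of 80% corresponds to a facility-to-IT energy ratio of 1.25.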

Further improvement in energy efficiency will largely need to come from energy re-use and improved energy efficiency of the computer systems themselves. Energy re-use is a consideration in many data centers, in particular if there are nearby needs for energy, such as heating of buildings or use in industrial processes. For re-use of energy in the form of hot water from data centers, the warmer the water is the more useful it tends to be. High heat densities have brought liquid cooling into rack rows in the form of liquid cooled rack doors, liquid cooling coils on top of or between racks, or direct liquid cooling of components or complete servers. Direct cooling can generate the highest water temperatures of the liquid cooling options and hence the highest quality energy for re-use.

The energy efficiency of CMOS continues to improve rapidly, but since not all logic is actively used all the time it is important to develop and use techniques that make energy consumption track the work carried out. Over the last few years CPU designs have included abilities to control power states as well as performance, through control of clock frequencies, as means of making energy consumption increasingly related to workload. An increasing number of power domains on chips allows independent control of many chip areas, and an increasing number of power states enables different levels of (deep) sleep with corresponding savings in energy consumption. The control of clock rates, Dynamic Voltage and Frequency Scaling, enables optimization of energy consumption for workloads: for some workloads, power consumption decreases more rapidly than execution time increases as the clock frequency is reduced, and hence a lower clock rate is beneficial from an energy point of view. At this time the operating system makes use of some of these features, firmware makes use of some others, and few are accessible from applications. In addition to improved energy efficiency through controlling the CPU operating state, savings can also be made through simplified instruction sets that reduce the complexity of the CPUs. Many instructions in the x86 instruction set are not used in a given application, with a range of scientific applications using less than 30% of the instruction set.
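The trade-off behind Dynamic Voltage and Frequency Scaling can be illustrated with a toy model. Everything in the sketch below is an assumption made for illustration, not a description of any particular CPU: execution time is split into a part that scales inversely with clock frequency and a memory-bound part that does not, dynamic power is taken to scale with the cube of frequency (voltage roughly proportional to frequency), and static power is constant. All names and numbers are hypothetical.

```python
def run_time_s(f_ghz, f_max_ghz, cpu_s_at_fmax, mem_s):
    """Execution time: CPU-bound part scales with 1/f, memory-bound part is fixed."""
    return cpu_s_at_fmax * f_max_ghz / f_ghz + mem_s


def power_w(f_ghz, f_max_ghz, static_w, dynamic_w_at_fmax):
    """Power: constant static part plus dynamic part assumed proportional to f**3."""
    return static_w + dynamic_w_at_fmax * (f_ghz / f_max_ghz) ** 3


def energy_j(f_ghz, *, f_max_ghz, cpu_s_at_fmax, mem_s, static_w, dynamic_w_at_fmax):
    """Energy to solution at clock frequency f_ghz under the toy model."""
    t = run_time_s(f_ghz, f_max_ghz, cpu_s_at_fmax, mem_s)
    p = power_w(f_ghz, f_max_ghz, static_w, dynamic_w_at_fmax)
    return t * p


def best_frequency(freqs_ghz, **workload):
    """Frequency from freqs_ghz that minimizes energy to solution."""
    return min(freqs_ghz, key=lambda f: energy_j(f, **workload))


if __name__ == "__main__":
    freqs = [1.5, 2.0, 2.5, 3.0]
    # Hypothetical memory-bound workload on a CPU dominated by dynamic power.
    memory_bound = dict(f_max_ghz=3.0, cpu_s_at_fmax=20.0, mem_s=80.0,
                        static_w=30.0, dynamic_w_at_fmax=70.0)
    # Hypothetical compute-bound workload on a CPU dominated by static power.
    static_heavy = dict(f_max_ghz=3.0, cpu_s_at_fmax=100.0, mem_s=0.0,
                        static_w=80.0, dynamic_w_at_fmax=20.0)
    for name, w in (("memory bound", memory_bound), ("static heavy", static_heavy)):
        print(name, "-> lowest energy at", best_frequency(freqs, **w), "GHz")
```

In this model the memory-bound workload uses the least energy at the lowest clock rate, since power falls faster than run time grows, while the workload dominated by static power is better off running at full speed and finishing early. Which regime applies to a real system has to be measured.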

Control of CPU state from an energy perspective is the most advanced at this time. However, in recent years efforts have also been made to introduce load related control of interconnection networks, in particular for Ethernet, for which both sleep modes and load related data rates have been proposed. In the first standard for an energy efficient Ethernet, approved September 30, 2010, 100 Mbps and 1 Gbps Ethernet chips are to transition into sleep mode when idle, whereas for 10 Gbps chips a transition to lower data rates should take place under light or no load. The reason for not specifying a sleep mode for 10 Gbps is the potentially long time to return to active mode from sleep mode. Had these energy efficient features of the standard been in place, the estimated savings across all uses of Ethernet would be 5 TWh/yr [167].
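The effect of a sleep mode, and the role of the wake-up time, can be sketched with a simple energy account for a single link. This is a simplified illustration and not the state machine of the standard: it assumes that without a sleep mode the link draws its active power at all times, and that with a sleep mode an idle gap longer than the wake-up time is spent asleep except for the final wake-up interval, which is charged at active power. The function name and all power and timing values are hypothetical.

```python
def link_energy_j(busy_s, idle_gaps_s, active_w, sleep_w, wake_s, sleep_enabled=True):
    """Energy used by one link over its busy time plus a list of idle gaps."""
    energy = busy_s * active_w
    for gap in idle_gaps_s:
        if sleep_enabled and gap > wake_s:
            # Sleep for most of the gap, then wake up at active power.
            energy += (gap - wake_s) * sleep_w + wake_s * active_w
        else:
            # Gap too short to benefit from sleeping (or no sleep mode).
            energy += gap * active_w
    return energy


if __name__ == "__main__":
    gaps = [5.0, 0.0005, 20.0]   # hypothetical idle gaps, seconds
    common = dict(busy_s=10.0, idle_gaps_s=gaps,
                  active_w=1.0, sleep_w=0.1, wake_s=0.001)
    always_on = link_energy_j(sleep_enabled=False, **common)
    with_sleep = link_energy_j(sleep_enabled=True, **common)
    print(f"always on: {always_on:.4f} J, with sleep: {with_sleep:.4f} J")
    print(f"saving: {1 - with_sleep / always_on:.0%}")
```

The model also shows why a long wake-up time is a concern: as `wake_s` grows relative to the typical idle gap, fewer gaps qualify for sleeping and the saving shrinks, quite apart from the latency added to the first packet after a gap.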

Memory is the second largest consumer of energy in most computer systems today and is expected to become the largest energy consuming subsystem in large computer systems of the future. At this time there are no dynamic control features for memory, and the power consumption in idle mode is only up to 20% lower than in active mode. Dynamic control of memory, or significantly reduced memory power consumption through the use of new technology, is required for achieving significantly improved energy efficiency of computer systems. Energy efficiency has long been a focus in the mobile device market, including a focus on energy efficient memory systems. Significant gains in energy efficiency are possible through a different memory system architecture that brings memory closer to the CPU [139]. With improved tools for generating chip designs, the advantages of domain specific designs may outweigh the increased costs due to limited volumes.

The largest potential for increased energy efficiency lies in increased utilization of the hardware, with many applications achieving efficiencies in the 5 – 30% range based on fraction of peak floating-point capability. Though this measure is questionable since the CPU is a diminishing part of both the system capital cost and energy use, it is likely that in today’s systems the achieved fraction of peak memory bandwidth or network bandwidth is no higher. Hence, the opportunities for improved energy efficiency through improved algorithms and software, new architectures, and control of operating conditions are significant and need to be pursued vigorously.

References


16. Revealed: Google’s New Mega Data Center in Finland. September 15, 2010.  


   http://www.theregister.co.uk/2009/07/16/google_chillerless_data_center


   http://gigaom.com/cleantech/google-buys-wind-power-first-deal-for-google-energy

   http://www.washingtonpost.com/wp-dyn/content/article/2010/10/12/AR2010101202271.html


http://www.google.com/corporate/green/datacenters/measuring.html

31. HP’s Wynyard Data Center and its Unique Cooling Setup.  


http://searchdatacenter.techtarget.com/feature/How-do-I-cool-high-density-racks

34. Solutions – Hot Air Return.  
http://cold-aisle-containment.co.uk/solutions/hot-air-return/index.html


36. SGI. Next Generation Data Center Infrastructure. ICE Cube Modular Data Center. Overview and Features.  

http://www.datacentermap.com/blog/datacenter-container-55.html


http://netseminar.stanford.edu/seminars/10_25_07.ppt

40. R. Miller. December 2, 2008. Microsoft goes All-In on Container Data Centers. Data Center


http://www.apcmedia.com/salestools/NRAN-76TTJY_R2_EN.pdf


68. EPRI. 80 Plus Verification and Testing Report. 


http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5624675&tag=1


http://www.netlib.org/benchmark/hpl

73. Top500 Supercomputer sites. www.top500.org


http://uptimeinstitute.org/content/view/155/147

78. B. Schultz. March 12, 2008. Data-Center Energy Consumption: Worse than we Thought?


80. PDC – Center for High Performance Computing. www.pdc.kth.se


90. S. Vangal, J. Howard, G. Ruhl, S. Dighe, H. Wilson, J. Tschanz, D. Finan, P. Iyer, A. Singh,


eQPACE Architecture. http://www.fz-juelich.de/jsc/juice/eQPACE_Meeting

122. SiCortex. http://en.wikipedia.org/wiki/SiCortex


http://www.cse.nd.edu/~kogge/reports.html

http://www.cse.nd.edu/~kogge/reports.html


http://www.ll.mit.edu/HPEC/agendas/proc09/Day1/S1_0955_Kogge_presentation.ppt

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5506076&tag=1

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=4563875&tag=1


141. S.Troia. March 29, 2010. You down with AMD-P?

http://www.acpi.info/DOWNLOADS/ACPIspec40a.pdf


145. AMD. April 22, 2010. BIOS and Kernel Developer’s Guide (BKDG) For AMD Family 10h


