Data centers are significant consumers of energy, and it is increasingly clear that there is much room for improvement in how that energy is used. However, there seems to be a preference for focusing on energy efficiency rather than energy elasticity. Energy elasticity is the degree to which energy consumption changes when the workload to be processed changes. For example, an IT infrastructure with a high degree of energy elasticity consumes significantly less power when it is idle than when it is running at its maximum processing potential. Conversely, an IT infrastructure with a low degree of energy elasticity consumes almost the same amount of electricity whether it is in use or idle. We can use this simple equation:
Elasticity = (% change in energy usage / % change in workload)
If elasticity is greater than or equal to one, the curve is considered to be elastic. If it is less than one, the curve is said to be inelastic.
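As a minimal sketch of the calculation, assume elasticity is read as the percentage change in energy usage divided by the percentage change in workload, so that a server whose power draw barely moves comes out as inelastic. The function below compares two operating points. The function name and the sample wattages are illustrative assumptions, not measurements.

```python
def energy_elasticity(load_a, power_a, load_b, power_b):
    """Elasticity between two operating points (load in %, power in watts)."""
    if load_a == 0 or power_a == 0 or load_a == load_b:
        raise ValueError("need a non-zero starting point and a change in load")
    pct_load = (load_b - load_a) / load_a
    pct_power = (power_b - power_a) / power_a
    return pct_power / pct_load

# Hypothetical server with a nearly flat power curve:
# 10% load draws 200 W, 75% load draws 220 W.
flat = energy_elasticity(10, 200, 75, 220)

# Hypothetical energy proportional server:
# 10% load draws 30 W, 75% load draws 225 W.
proportional = energy_elasticity(10, 30, 75, 225)

for name, value in (("flat", flat), ("proportional", proportional)):
    label = "elastic" if value >= 1 else "inelastic"
    print(f"{name}: {value:.3f} ({label})")
```

With these assumed numbers the flat server lands far below one, while the proportional server sits at the boundary between inelastic and elastic.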
Given that it is not unusual for servers to operate at under ten percent average utilization, and that most servers do not have a high energy elasticity (according to IDC, a server operating at 10% utilization still consumes the same power and cooling as a server operating at 75% utilization), it is worthwhile to focus more on energy elasticity. A picture can say more than words, and this energy elasticity issue is visualized very well in a presentation by Clemens Pfeiffer, CTO of Power Assure, at the NASA IT Summit 2010. It shows that without optimization, that is, without energy elasticity, power consumption is indifferent to changes in application load.
Barroso and Hölzle of Google have made the case for energy proportional (energy elastic) computing, based on the observation that servers in data centers today operate well below peak load levels on average. According to them, energy-efficiency characteristics are primarily the responsibility of component and system designers: “They should aim to develop machines that consume energy in proportion to the amount of work performed”. A popular technique for delivering some form of energy proportional behavior in servers right now is consolidation using virtualization. By abstracting your application from the hardware, you can shift workloads across a data center dynamically. These techniques:
- utilize heterogeneity to select the most power-efficient servers at any given time
- utilize live Virtual Machine (VM) migration to vary the number of active servers in response to workload variation
- provide control over power consumption by allowing the number of active servers to be increased or decreased one at a time.
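The three points above can be sketched as a greedy selection: rank servers by capacity per watt, then activate them one at a time until the current load is covered. This is only an illustration. The `Server` fields, the sample fleet, and the greedy rule are assumptions, and a real placement engine would also have to account for VM migration cost and headroom.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Server:
    name: str
    capacity: float      # work units per second the server can handle (assumed)
    active_watts: float  # assumed power draw while switched on

def select_active(servers, load):
    """Return the servers to keep on, most power-efficient first."""
    ranked = sorted(servers, key=lambda s: s.capacity / s.active_watts, reverse=True)
    active, covered = [], 0.0
    for server in ranked:
        if covered >= load:
            break
        active.append(server)   # switch on one more server
        covered += server.capacity
    if covered < load:
        raise ValueError("load exceeds total capacity")
    return active

# Hypothetical heterogeneous fleet.
fleet = [
    Server("old-1", capacity=100, active_watts=400),
    Server("new-1", capacity=200, active_watts=300),
    Server("new-2", capacity=200, active_watts=320),
]

for load in (150, 350, 450):
    chosen = select_active(fleet, load)
    watts = sum(s.active_watts for s in chosen)
    print(load, [s.name for s in chosen], watts, "W")
```

As the load grows, the least efficient server is the last one to be switched on, and it is the first candidate to be switched off again when the load drops.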
Although servers are the biggest consumers of energy, storage and network devices are also consumers. The EPA Report to Congress on Server and Data Center Energy Efficiency suggests that servers will on average account for about 75 percent of total IT equipment energy use, storage devices for around 15 percent, and network equipment for around 10 percent. Energy elasticity is therefore also a relevant issue for storage and network devices.
Organizations have an increasing demand for storing digital data, both in amount and in duration, due to new and existing applications and to regulations. According to research by Florida University and IBM, storage energy consumption is expected to continue to increase as data volumes grow and disk performance and capacity scaling slow:
- storage capacity per drive is increasing more slowly, which will force the acquisition of more drives to accommodate growing capacity requirements
- performance improvements per drive have not and will not keep pace with capacity improvements.
Storage will therefore consume an increasing percentage of the energy used by the IT infrastructure. Of the data that is stored, only a small part is active. So it is the same story as for servers: on average, storage operates well below peak load levels. A potential energy reduction of 40-75% from using an energy proportional system is claimed. According to the same research, several storage energy saving techniques are available:
- Consolidation: Aggregation of data into fewer storage devices whenever performance requirements permit.
- Tiering/Migration: Placement/movement of data into storage devices that best fit its performance requirements
- Write off-loading: Diversion of newly written data to enable spinning down disks for longer periods
- Adaptive seek speeds: Allow trading off performance for power reduction by slowing the seek and waiting an additional rotational delay before servicing the I/O.
- Workload shaping: Batching I/O requests to allow hard disks to enter low power modes for extended periods, or to allow workload mix optimizations.
- Opportunistic spindown: Spinning down hard disks when idle for a given period.
- Spindown/MAID: Keeping disks that hold unused data spun down most of the time.
- Dedup/compression: storing smaller amounts of data using very efficient
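To make one of these techniques concrete, here is a rough sketch of opportunistic spindown: the disk spins down once it has been idle for a fixed timeout and spins up again on the next request. The power figures, the timeout, and the request trace are hypothetical, and spin-up energy and latency are ignored for brevity.

```python
def disk_energy(request_times, end_time, idle_timeout,
                spinning_watts=8.0, standby_watts=1.0):
    """Energy in joules for a disk that spins down after idle_timeout seconds.

    request_times are sorted timestamps (seconds) of I/O requests, each
    treated as instantaneous. All wattages are assumed values.
    """
    energy = 0.0
    last = 0.0
    for t in list(request_times) + [end_time]:
        gap = t - last
        spinning = min(gap, idle_timeout)  # disk keeps spinning until the timeout
        standby = gap - spinning           # then waits in standby for the next request
        energy += spinning * spinning_watts + standby * standby_watts
        last = t
    return energy

trace = [10, 20, 30, 1000, 1010]  # hypothetical bursty, mostly idle workload

always_on = disk_energy(trace, end_time=3600, idle_timeout=float("inf"))
with_spindown = disk_energy(trace, end_time=3600, idle_timeout=60)

print(f"always on: {always_on:.0f} J, with spindown: {with_spindown:.0f} J")
print(f"saving: {100 * (1 - with_spindown / always_on):.0f}%")
```

The saving depends entirely on how idle the trace is and on the assumed wattages. A short timeout saves more energy but, in a real system, costs more spin-up delays.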
Storage virtualization can also help, but component and system designers should aim to develop machines that consume energy in proportion to the amount of work performed. There is still a way to go before storage is energy elastic.
According to a paper from the USENIX conference NSDI’10, “today’s network elements are also not energy proportional: fixed overheads such as fans, switch chips, and transceivers waste power at low loads. Even though the traffic varies significantly with time, the rack and aggregation switches associated with these servers draw constant power.” And again the same recipe comes up: component and system designers should aim to develop machines that consume energy in proportion to the amount of work performed. On the other hand, as explained in the paper, some kind of network optimizer must monitor traffic requirements, choose and adjust the network components to meet the energy, performance and fault tolerance requirements, and power down as many unneeded links and switches as possible. In this way, average savings of 25-40% of the network energy in data centers are claimed.
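A very simplified sketch of such an optimizer is given below. It assumes identical aggregation switches and a single aggregate traffic figure, and it does not reproduce the paper's actual approach. The idea is to keep just enough switches on to carry the measured traffic plus a safety margin, add spares for fault tolerance, and power down the rest. All names and numbers are hypothetical.

```python
import math

def switches_needed(traffic_gbps, switch_capacity_gbps, total_switches,
                    safety_margin=0.2, spares=1):
    """Number of identical switches to keep powered for the given traffic."""
    demand = traffic_gbps * (1 + safety_margin)   # performance headroom
    needed = math.ceil(demand / switch_capacity_gbps) + spares  # fault tolerance
    # Never go below one active switch or above what is installed.
    return max(1, min(needed, total_switches))

hourly_traffic = [5, 12, 38, 71, 40, 9]  # hypothetical Gbit/s samples
installed = 10

for traffic in hourly_traffic:
    active = switches_needed(traffic, switch_capacity_gbps=10,
                             total_switches=installed)
    print(f"{traffic:>3} Gbit/s -> {active} active, {installed - active} powered down")
```

The `safety_margin` and `spares` parameters are the knobs that trade energy savings against performance and fault tolerance. Setting both to zero maximizes savings but leaves no room for traffic spikes or failures.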
To make servers, storage and the network in data centers energy proportional, we will also need to take air-conditioning and cooling needs into account. Fluctuations in energy usage are equivalent to fluctuations in heat, and the question is whether air-conditioning can be quickly ramped up and down to cool the particular data center zones that see increased server, storage or network use. As Dave Craven of Spinwave Systems stated in a recent editorial article in Processor, “Unfortunately, the mechanical systems used to cool and ventilate large data centers haven’t kept up with technological advances seen in the IT world.” Craven adds: “Many buildings where they are putting newer technology and processes are still being heated and cooled by processes designed 20 years ago.” Given that the PUE is driven by the cooling efficiency (see for example the white paper of Trendpoint), it looks like cooling is the weak spot in creating an energy elastic data center.
The idea of ‘disabling’ critical infrastructure components in data centers has been considered taboo. Any dynamic energy management system that attempts to achieve energy elasticity (proportionality) by powering off a subset of idle components must demonstrate that the active components can still meet the current offered load, and that a rapid inactive-to-active mode transition is possible and/or that changing load in the immediate future can be met. The power savings must be worthwhile, performance effects must be minimal, and fault tolerance must not be sacrificed.
Energy management has emerged as one of the most significant challenges faced by data center operators. Defining an energy management control knob to tune between energy efficiency, performance, and fault tolerance requires a combination of improved components and improved component management. The data center is a dynamic, complex system with many interdependencies. Managing and orchestrating this kind of system calls for sophisticated mathematical models and software that uses algorithms to automatically make the necessary adjustments in the system.