How IT and facilities managers team up for energy efficiency
Data centers are very different from offices, and therefore impose very different requirements on a building management system. Yes, there is a need to control lighting, power, environmental conditions and security in a data center just as there is in an office building.
But in "lights out" data centers, the power must be clean and continuous, and temperature and humidity settings are determined by the equipment rather than human comfort. Perhaps the biggest difference, though, is that the building management system is operated by the Facilities department, while the data center is under the control of the IT department. This makes it difficult, for example, for the data center to participate in an organization’s demand response and energy conservation initiatives.
Before exploring how IT and facilities departments can cooperate to better manage all of an organization's energy consumption, it is necessary to introduce a type of building management system (BMS) purpose-built for the special needs of the data center: the data center infrastructure management (DCIM) system. Whereas office building management focuses on the working conditions created by the facility's heating, cooling and lighting, the DCIM system manages the building, the power and cooling infrastructure, and the IT/server capacity in an integrated fashion, optimizing the facility as application demand rises and falls.
There are several ways IT and facility managers can work together to minimize energy consumption in the data center, whether the data center is colocated in an office building or has its own dedicated facility. Among the most common initiatives are:
- Right-sizing the uninterruptible power supply (UPS) and power distribution equipment to minimize inefficiencies
- Making greater use of outside air and/or thermal storage for cooling
- Eliminating cooling inefficiencies and/or upgrading the computer room A/C system
- Adopting a hot/cold aisle configuration, and increasing cold aisle inlet temperatures to 80°F (27°C) as recommended by the American Society of Heating, Refrigerating and Air-Conditioning Engineers (ASHRAE)
Improvements in these areas can be measured by the DCIM system using the Power Usage Effectiveness (PUE) metric. PUE is the ratio of total power consumed to the power used by the IT equipment. Today's typical data center achieves a rating of about 1.8 to 1.89, according to the latest report from the Uptime Institute, which means that just over half of the total power consumed is being used by the IT equipment (servers, storage and networking infrastructure), with the rest going mostly to cooling and the inherent inefficiencies in power distribution systems.
The Environmental Protection Agency has established a target for data centers in the U.S. of a PUE rating between 1.1 and 1.4. The benefit in reaching this target range can be profound; for example, an improvement in PUE from 2.3 to 1.3 nearly doubles the power available for IT equipment, thereby extending the life of the data center.
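The arithmetic behind that claim is easy to check. The sketch below is illustrative; the function names and the 1,000 kW facility size are assumptions of mine, not figures from the EPA or any DCIM product.

```python
def pue(total_facility_kw: float, it_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT power."""
    return total_facility_kw / it_kw

def it_power_available(total_facility_kw: float, pue_rating: float) -> float:
    """Power left for IT equipment at a given PUE, for a fixed facility capacity."""
    return total_facility_kw / pue_rating

# A hypothetical 1,000 kW facility at PUE 2.3 vs. 1.3:
before = it_power_available(1000, 2.3)  # ~435 kW for IT
after = it_power_available(1000, 1.3)   # ~769 kW for IT
print(f"IT power nearly doubles: {before:.0f} kW -> {after:.0f} kW")
```

The same facility feeds roughly 1.77 times as much IT load at PUE 1.3 as at 2.3, which is the "nearly doubles" in the paragraph above.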
Reducing the power consumed by the IT equipment has an obvious direct benefit, but also an indirect one: less IT power means less heat to remove, and therefore less cooling. As mentioned above, the DCIM system focuses on server capacity for two reasons.
The first is that, as both the U.S. Department of Energy and Gartner have observed, the cost to power a typical server over its useful life can now exceed the original capital expenditure. Gartner additionally notes that it can cost over $50,000 annually to power a single rack of servers.
The second reason is that total server capacity is established by the performance level required during the peak workload, which means the only thing the excess capacity is doing during off-peak periods (typically more than 80 percent of the time) is wasting energy. So there are considerable savings to be achieved.
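As a rough illustration of that waste (all figures here are hypothetical, chosen only to show the arithmetic):

```python
def wasted_energy_kwh(idle_kw: float, n_servers: int,
                      off_peak_hours_per_day: float) -> float:
    """Energy spent keeping peak-sized capacity online during off-peak hours."""
    return idle_kw * n_servers * off_peak_hours_per_day

# 100 spare servers idling at 0.15 kW each, off-peak for 80% of the day (19.2 h):
daily_waste = wasted_energy_kwh(0.15, 100, 0.8 * 24)
print(f"{daily_waste:.0f} kWh wasted per day")  # 288 kWh/day, before cooling
```

Note that the real waste is larger still, since every wasted kWh at the server also drags cooling and power-distribution overhead along with it, per the PUE ratio.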
There are three basic ways IT managers can reduce overall server power consumption:
- Refresh IT equipment with newer, more energy-efficient systems
- Consolidate and virtualize servers to improve overall utilization
- Match server capacity to the actual workload dynamically, and turn off spare equipment during periods of low utilization
An increasingly important aspect of power conservation efforts is the energy efficiency of the servers themselves, and the most efficient servers are those with the highest number of transactions per second per Watt (TPS/Watt). The PAR4 Efficiency Rating system used in the Underwriters Laboratories’ UL2640 standard is the most accurate means for IT managers to compare the transactional efficiency of legacy servers with newer ones, and newer models of servers with one another. Indeed, assessing the energy efficiency of servers should now be considered a best practice during every hardware refresh cycle and whenever adding capacity.
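TPS/Watt itself is just a ratio, so comparing servers is straightforward once measured figures are in hand. A sketch with hypothetical numbers (not actual PAR4/UL 2640 results):

```python
def tps_per_watt(transactions_per_sec: float, watts: float) -> float:
    """Transactional efficiency: transactions per second per Watt. Higher is better."""
    return transactions_per_sec / watts

# Hypothetical measurements for a legacy server vs. a current-generation one:
legacy = tps_per_watt(40_000, 400)    # 100 TPS/Watt
current = tps_per_watt(90_000, 300)   # 300 TPS/Watt
print(f"Refresh improves transactional efficiency {current / legacy:.1f}x")
```

The point of a standardized rating like PAR4 is that both numerator and denominator are measured the same way across vendors and generations, so ratios like the one above are actually comparable.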
Because even the most energy-efficient servers continue to waste power when under-utilized, a consolidation and virtualization initiative is warranted. Virtualizing the servers can increase overall utilization from around 10 percent (typical of dedicated servers) to between 20 and 30 percent, and to over 50 percent with more dynamic management systems. Successful consolidation and virtualization initiatives can also reclaim a considerable amount of rack space and stranded power, which also extends the life of a data center.
Even the best virtualized and most energy-efficient server configurations waste power during periods of low application demand. Total server power consumption can typically be reduced by up to 50 percent by matching online capacity (measured in cluster size) to actual load in real time. Runbooks can be used to automate the steps involved in resizing clusters and/or de-/re-activating servers, whether on a predetermined schedule or dynamically in response to changing loads. Such dynamic or "stretchable" cluster configurations are more energy-proportionate to the workload, and are capable of achieving a utilization rate in the 70-80 percent range on the servers that are "on," with a significant number of servers "off" during periods of low utilization.
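The cluster-resizing arithmetic can be sketched as follows. The target utilization, per-server capacity, and minimum cluster size are illustrative assumptions, not values from any particular runbook tool.

```python
import math

def required_servers(current_load_tps: float,
                     per_server_tps: float,
                     target_utilization: float = 0.75,
                     min_servers: int = 2) -> int:
    """Servers needed so each online server runs near the target
    utilization (here 75%, inside the 70-80 percent band)."""
    needed = math.ceil(current_load_tps / (per_server_tps * target_utilization))
    return max(needed, min_servers)  # keep a redundancy floor

# A cluster of 10,000-TPS servers, sized for a 100,000 TPS peak:
peak = required_servers(100_000, 10_000)      # 14 servers online
off_peak = required_servers(20_000, 10_000)   # 3 online; the rest powered off
print(f"peak: {peak}, off-peak: {off_peak}")
```

A runbook would wrap this calculation with the mechanics of draining, deactivating, and reactivating the servers, on a schedule or in response to a load signal.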
As the electrical grid becomes increasingly stressed and unstable, the frequency and duration of demand response (DR) events will also increase. Including the data center in the organization’s DR participation will require close cooperation between the facility and IT departments.
The BMS may, for example, set a target reduction for the data center based on the severity and duration of the DR event and the organization's energy management objectives. The DCIM system will extend this target to the IT environment, taking into account the anticipated workload during the DR event and the service levels required for the various applications, to make sure there is no impact on reliability.
At a minimum, the DCIM system should be able to power-cap less critical servers and power down others that are not needed to satisfy the current workload. Depending on the event's duration, the DCIM system might also temporarily raise the thermostat setting, taking action as necessary to prevent "hot spots" from forming, or reduce cooling power consumption by drawing on pre-cooled chilled water.
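A runbook step for this kind of triage might look like the following sketch. The server attributes and the 30 percent cap-recovery figure are assumptions for illustration, not part of any DCIM product's API.

```python
def plan_dr_reduction(servers: list[dict], target_kw: float) -> list[tuple[str, str]]:
    """Act on the least critical servers first until the target reduction
    is met. Each server dict carries 'name', 'criticality' (higher = more
    critical), 'draw_kw', and 'needed' (whether the current workload needs it)."""
    actions, saved = [], 0.0
    for s in sorted(servers, key=lambda s: s["criticality"]):
        if saved >= target_kw:
            break
        if not s["needed"]:
            actions.append((s["name"], "power-off"))   # full draw recovered
            saved += s["draw_kw"]
        else:
            actions.append((s["name"], "power-cap"))   # assume cap recovers ~30%
            saved += s["draw_kw"] * 0.3
    return actions

# Example: shed 0.4 kW by shutting down an idle batch server first.
actions = plan_dr_reduction(
    [{"name": "db1", "criticality": 9, "draw_kw": 0.5, "needed": True},
     {"name": "batch1", "criticality": 1, "draw_kw": 0.4, "needed": False}],
    target_kw=0.4)
# actions == [("batch1", "power-off")]
```

A real implementation would also respect the per-application service levels the DCIM system tracks, rather than a single criticality score.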
An even better approach involves being proactive about DR events, which predictably occur in the afternoon and evening on hot summer days. The ability of the DCIM system to dynamically shed and shift loads within a single data center also works across multiple data centers, which most organizations already have for business continuity purposes. Because electricity rates are at their lowest at night, when demand is low and baseload generation is under-utilized, shifting the current workload to "follow the moon" can result in considerable savings.
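The rate arithmetic behind follow-the-moon savings can be sketched with hypothetical tariffs (the $/kWh figures and the shiftable fraction below are illustrative assumptions):

```python
def shifted_cost(kwh: float, day_rate: float, night_rate: float,
                 fraction_shifted: float) -> float:
    """Daily energy cost when a fraction of the load moves to night rates."""
    return kwh * ((1 - fraction_shifted) * day_rate + fraction_shifted * night_rate)

# Hypothetical tariffs: $0.18/kWh daytime, $0.08/kWh overnight; 60% of a
# 10,000 kWh/day load can be shifted to a night-side data center:
baseline = 10_000 * 0.18                          # $1,800/day, all at day rates
shifted = shifted_cost(10_000, 0.18, 0.08, 0.6)   # $1,200/day
print(f"${baseline:.0f} -> ${shifted:.0f} per day")
```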
The savings from following the moon are even greater because outside ambient air temperatures are also at their lowest at night, which can substantially cut cooling costs. In fact, a well-designed data center operated exclusively at night during a finite "shift" might not even need a power-hungry air-conditioning system, and the IT department can be excused from participating in any more DR events.