A zombie server, also known as a ghost server, is an unintentionally idle or nonfunctional computing asset that consumes resources, like power, cooling, and space. Typically, zombie servers aren’t apparent to data center managers. While they hide in plain sight, as many as 30% of servers in a data center may be zombie servers.
What Is the Cost of Zombie Servers?
Zombie servers can take a serious bite out of your data center's resources.
- Money lost. Zombie servers result in stranded capacity and unnecessary operational and capital expenses. These costs range from the operational cost of powering dormant servers to capital expenditures for new racks and servers, purchased while entire cabinets of servers sit idle.
- Energy inefficiency and carbon cost. When zombie servers populate data centers, energy is wasted on server power and cooling at the megawatt scale. This translates to megatons of carbon dioxide being released into the atmosphere each year.
- Security risks. Zombie servers significantly expand the attack surface of a data center and can act as an open doorway for malware to reach other servers. Because they often go unnoticed and are not closely monitored, they tend to miss updates and patches and run outdated operating systems, all of which leave them vulnerable to cyberattacks.
How to Find Zombie Servers in Your Data Center
Collecting data is a key step in most investigations, and zombie hunting is no exception. A great place to start is power readings. A constant, low-power draw can indicate that a server is comatose. Conversely, erratic spikes in power consumption might raise flags for a compromised server running rogue applications. If available, CPU usage numbers can also provide insight into the activity levels of the suspected zombie servers.
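As a rough illustration of the power and CPU checks described above, here is a minimal Python sketch. The function name `classify_power_profile` and every threshold in it (120 W idle ceiling, 5% flatness, 50% spikiness, 5% CPU) are assumptions made up for this example, not values from any product or standard; tune them to what your own fleet looks like.

```python
from statistics import mean, pstdev


def classify_power_profile(power_watts, cpu_percent=None,
                           idle_watts=120.0, flat_ratio=0.05,
                           spike_ratio=0.5, idle_cpu=5.0):
    """Label a server from a series of power readings in watts.

    All thresholds are illustrative assumptions, not recommendations.
    """
    avg = mean(power_watts)
    # Coefficient of variation: spread of the readings relative to the average.
    variation = pstdev(power_watts) / avg if avg > 0 else 0.0

    if variation >= spike_ratio:
        # Erratic spikes: may deserve a security review.
        return "erratic"

    low_and_flat = avg <= idle_watts and variation <= flat_ratio
    # If no CPU data is available, fall back to the power check alone.
    cpu_idle = cpu_percent is None or mean(cpu_percent) <= idle_cpu

    if low_and_flat and cpu_idle:
        return "suspected zombie"
    return "active"


print(classify_power_profile([100, 101, 99, 100], [1, 2, 1, 1]))  # suspected zombie
print(classify_power_profile([80, 400, 90, 450]))                 # erratic
print(classify_power_profile([300, 320, 280, 310], [40, 55, 35, 60]))  # active
```

A label from a check like this is only a starting point for investigation; it does not confirm that a server is a zombie.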
Reviewing orchestration layer data can also be an invaluable point of comparison for your power and CPU observations. Analytics performed on orchestration layer data can provide insights into how much work a server is doing relative to its recorded power draw and CPU activity. If a server isn’t completing any assigned tasks but is still drawing power, it’s probably a zombie.
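One way to picture this comparison is a simple work-per-energy ratio. The sketch below assumes you can export, per server, a count of tasks completed (from the orchestration layer) and the energy used in kWh (from power monitoring) over the same period. The function names, the input shape, and the `min_tasks_per_kwh` cutoff are all hypothetical.

```python
def work_per_kwh(tasks_completed, energy_kwh):
    """Tasks completed per kWh consumed; 0.0 if no energy was recorded."""
    if energy_kwh <= 0:
        return 0.0
    return tasks_completed / energy_kwh


def flag_idle_consumers(servers, min_tasks_per_kwh=1.0):
    """Return names of servers that drew power but did little or no work.

    servers: mapping of server name -> (tasks_completed, energy_kwh).
    min_tasks_per_kwh is an assumed cutoff; pick one that fits your workloads.
    """
    flagged = []
    for name, (tasks, kwh) in servers.items():
        # Skip servers with no recorded energy; they may simply be powered off.
        if kwh > 0 and work_per_kwh(tasks, kwh) < min_tasks_per_kwh:
            flagged.append(name)
    return sorted(flagged)


# Made-up sample data for the same reporting period.
fleet = {
    "web-01": (1200, 40.0),
    "db-07": (0, 35.0),
    "old-app": (3, 30.0),
}
print(flag_idle_consumers(fleet))  # ['db-07', 'old-app']
```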
Leaning into a DevOps approach and tracking down the server owner might also be helpful. The server’s owners might have left the company or moved on to a new project without deactivating the server. Likewise, they might be able to provide an explanation as to why the server’s been left plugged in, if there is one.
Best Practices to Keep the Zombies at Bay
Following some simple data center management best practices can mitigate the risk of zombie servers.
- Document server operations. Knowing a server’s function at the time of installation can provide an invaluable baseline to refer back to when evaluating a server’s status.
- Set up alerts. Once a normal range for power consumption and CPU activity has been established, you can define triggers that flag a server for investigation when it falls outside that range. When deciding which triggers should result in an alert, consider what patterns you’ve observed in confirmed zombie server cases.
- Advocate for efficiency. Server owners, managers, and decision-makers should be made aware of the environmental, financial, and security costs of zombie servers. By practicing advocacy, you can instill a motivation to act and prioritize data center efficiency.
- Create a protocol. Organizing protocols to monitor, label, and decommission suspected zombie servers can help to eliminate friction and obstacles on the path to data center optimization.
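The alerting practice above can be sketched in a few lines of Python. This example derives a normal range from historical power readings as the mean plus or minus `k` standard deviations, then checks new readings against it. The helper names and the choice of `k=3.0` are assumptions for illustration; a real deployment would likely rely on the alerting features of its monitoring tooling.

```python
from statistics import mean, pstdev


def baseline_range(history, k=3.0):
    """Return (low, high) as mean +/- k standard deviations of past readings."""
    center = mean(history)
    spread = pstdev(history)
    return (center - k * spread, center + k * spread)


def check_reading(value, normal_range):
    """Return an alert label if value is outside the range, else None."""
    low, high = normal_range
    if value < low:
        return "below baseline"
    if value > high:
        return "above baseline"
    return None


# Made-up historical power readings in watts for one server.
normal = baseline_range([200, 210, 190, 205, 195])

for reading in (205, 120, 400):
    alert = check_reading(reading, normal)
    if alert:
        print(f"{reading} W: {alert}, investigate")
```

The same pattern applies to CPU activity: build a range from known-good history, then flag readings that leave it.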
Easily Identify Potential Zombie Servers with DCIM Software
Zombie hunting is a lot of work. But the right Data Center Infrastructure Management (DCIM) software solution can make it a lot easier.
DCIM software can track granular power readings from intelligent rack PDUs to provide device-level power consumption data. A built-in ghost server report lists all the servers in your data center that have shown a minimal power draw over time. These servers can be investigated further to confirm whether they are in fact zombies. Adding up the total kWh of the servers in the report provides a quick estimate of how much energy and money can be saved by shutting them down.
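The savings estimate described above is simple arithmetic: sum the energy used by the listed servers and multiply by your electricity rate. Here is a minimal Python sketch. The server names, kWh figures, and the rate of 0.10 per kWh are made-up sample values, and the function is hypothetical rather than part of any DCIM product.

```python
def estimate_savings(server_kwh, price_per_kwh):
    """Return (total_kwh, estimated_cost) for the servers in a report.

    server_kwh: mapping of server name -> energy used over the period, in kWh.
    price_per_kwh: electricity rate in your currency per kWh.
    """
    total_kwh = sum(server_kwh.values())
    return total_kwh, total_kwh * price_per_kwh


# Made-up annual figures for servers on a hypothetical ghost server report.
report = {"db-07": 1000.0, "old-app": 1500.0}
total, cost = estimate_savings(report, price_per_kwh=0.10)
print(f"{total:.0f} kWh, about {cost:.2f} per year")
```

This counts server power only. Cooling and other overhead would add to the real figure, so treat the result as a lower bound.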
Comprehensive DCIM software offers a detailed asset inventory complete with asset visualizations and standard and custom fields, so you can track everything you want about your equipment and better judge whether a given server is a zombie.
DCIM software offers a centralized platform to keep track of moves, adds, and changes. This makes it easy to investigate any work done on suspected zombie servers and, if required, request a server to be removed with detailed visual work orders.
Want to see how Sunbird makes it easy to find zombie servers? Get your free test drive now.