Data center uptime is the guaranteed availability of a data center, measured annually and usually expressed as a percentage of the year. To keep up with modern business demands, data centers need to achieve very high levels of uptime. To do so, data center managers pay close attention to facility infrastructure issues. Data centers are typically categorized by their uptime in the following four tiers:
- Tier 1. Tier 1 data centers are generally used by small businesses and feature 99.671% uptime, no redundancy, and 28.8 hours of annual downtime.
- Tier 2. Tier 2 data centers are typically used by small and medium-sized businesses because they offer more redundancy than Tier 1. These data centers provide 99.741% uptime, which corresponds to roughly 22 hours of annual downtime.
- Tier 3. Tier 3 data centers are used by larger and more sophisticated companies because these facilities do not require a complete shutdown to replace or maintain equipment, and any component can be shut down without impacting service. Tier 3 data centers provide at least 72 hours of backup power and 99.982% uptime, with no more than 1.6 hours of annual downtime.
- Tier 4. Tier 4 data centers include multiple independent and isolated systems that serve as redundant capacity components and distribution paths to prevent an event from impacting several systems. This tier is reserved for enterprise-level customers who pay premiums to ensure the safe and efficient operation of their equipment. This tier provides 99.995% uptime, no more than 26.3 minutes of downtime per year, and 96 hours of outage protection.
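The downtime figures in the tiers above follow directly from the uptime percentages. The short Python sketch below shows the arithmetic, assuming a non-leap year of 8,760 hours; the function name is illustrative and not part of any standard or product.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; assumes a non-leap year


def annual_downtime_hours(uptime_percent: float) -> float:
    """Return the hours of downtime per year implied by an uptime percentage."""
    return (1 - uptime_percent / 100) * HOURS_PER_YEAR


# Uptime percentages taken from the tier list above.
for tier, uptime in [(1, 99.671), (2, 99.741), (3, 99.982), (4, 99.995)]:
    hours = annual_downtime_hours(uptime)
    print(f"Tier {tier}: {uptime}% uptime -> "
          f"{hours:.1f} hours ({hours * 60:.1f} minutes) per year")
```

Run this way, the Tier 4 percentage works out to about 26.3 minutes per year, and the Tier 2 percentage works out to about 22.7 hours, which is commonly quoted rounded down to 22 hours.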
What Impacts Data Center Uptime?
- Human error. Human error is inevitable, especially as data centers need constant maintenance, repairing, monitoring, and testing in order to function at their highest potential. Downtime can be the result of a breakdown in processes or a fault in human execution. Without the right tools, human error can be difficult to prevent, but if it is recognized quickly, the issue can be resolved and uptime can be restored.
- System failure. System failure is typically caused by using inadequate monitoring tools or none at all. Without proper data center monitoring practices, many conditions can cause downtime, such as cabinets drawing too much power and tripping a breaker, three-phase imbalance, extreme temperature or humidity levels, and water leaks.
- Natural disasters. Natural disasters such as hurricanes and earthquakes can have a significant impact on the performance of a data center, and therefore impact uptime. For instance, a power outage caused by a hurricane may halt a data center’s operations if not resolved quickly. To prevent extensive issues from natural disasters, organizations should have a detailed disaster recovery plan in place.
Why is Data Center Uptime Important?
- Expenses. During data center downtime, organizations can suffer major losses as the data center is unable to complete all necessary tasks. The costs of downtime can be directly attributed to lost sales for organizations that do business online, poor brand reputation causing loss of customers, reduced productivity, SLA payouts, and lost data.
- Security. Downtime increases the risk of a data center breach from hackers or other cyber attacks as there is minimal monitoring during downtime. The monitoring that occurs during uptime decreases the risk of a security breach and therefore avoids costs typically associated with a crisis response plan.
- Customer satisfaction. System downtime may prevent customers from being able to access an organization’s services when needed. This can lead to a decrease in customer satisfaction, and customers may opt to use another organization’s data center services if it has more uptime.
Maintain Uptime with DCIM Software
According to Gartner, data center downtime costs $5,600 per minute on average. Depending on the organization, this results in average costs between $140,000 and $540,000 per hour. With potential losses this high, data center managers deploy Data Center Infrastructure Management (DCIM) software to improve uptime.
DCIM software helps reduce human error and allows you to maintain uptime via:
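To put the cost figure in context, here is a minimal Python sketch that converts a downtime duration into an estimated cost. It assumes the $5,600 average is a per-minute cost, which is how that Gartner estimate is usually quoted; the function name and default value are illustrative only, and real costs vary widely by organization.

```python
COST_PER_MINUTE_USD = 5_600  # assumption: the cited average is a per-minute cost


def downtime_cost(minutes: float,
                  cost_per_minute: float = COST_PER_MINUTE_USD) -> float:
    """Estimate the cost in USD of an outage lasting the given number of minutes."""
    return minutes * cost_per_minute


# One hour of downtime at the average per-minute rate.
print(f"1 hour: ${downtime_cost(60):,.0f}")

# A 15-minute outage at a hypothetical higher rate of $9,000 per minute.
print(f"15 minutes: ${downtime_cost(15, cost_per_minute=9_000):,.0f}")
```

At the average rate, one hour of downtime comes to $336,000, which sits inside the hourly range given above.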
- Health polling. Receive an immediate alert when a device goes down so you can react quickly and restore service before the problem escalates.
- Thresholds. Set warning and critical thresholds on power and environmental data, and then leverage an enterprise health dashboard to easily spot threshold violations.
- Trend charts. Monitor trends over time to be proactive and react before you have a threshold violation and potential incident.
- ASHRAE cooling charts. Ensure your equipment meets ASHRAE recommendations with psychrometric cooling charts that show you which devices are operating outside of standard conditions.
- Thermal map time-lapse videos. Visualize temperature sensor readings over time to quickly identify and eliminate hot spots that can damage equipment.
- Cabinet capacity and redundancy monitoring. Create a daily report that highlights racks that are low on available capacity and dangerously close to falling outside of your redundancy requirements.
- Power monitoring. DCIM software automatically tracks the power at each breaker connection to ensure ratings are not exceeded. With live readings from inlet or outlet meters, it will prevent you from applying a load that will exceed breaker limits.
- Three-phase load balancing. Unbalanced power can lead to premature breaker trips and high voltages that reduce the useful life of equipment. Maintain three-phase load balance with DCIM software that alerts you when a device is in violation.
- Failover reports. Simulate failover scenarios and test what-if scenarios to ensure that power is always available to IT equipment.
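Several of the checks in this list come down to simple calculations. The Python sketch below illustrates three of them: threshold classification, three-phase imbalance, and a basic failover test. It is not Sunbird's API or any DCIM product's implementation. All function names are hypothetical, the imbalance formula (maximum deviation from the average, divided by the average) is one common definition, and the 80% breaker derating is an assumed policy that should be replaced with your own electrical standards.

```python
from statistics import mean


def classify(reading: float, warning: float, critical: float) -> str:
    """Classify a reading against upper warning and critical thresholds."""
    if reading >= critical:
        return "critical"
    if reading >= warning:
        return "warning"
    return "ok"


def phase_imbalance_percent(l1: float, l2: float, l3: float) -> float:
    """Largest deviation from the average phase current, as a percent of the average."""
    currents = (l1, l2, l3)
    avg = mean(currents)
    if avg == 0:
        return 0.0  # no load on any phase, so nothing to balance
    return max(abs(c - avg) for c in currents) / avg * 100


def survives_failover(feed_a_amps: float, feed_b_amps: float,
                      breaker_rating_amps: float, derating: float = 0.8) -> bool:
    """Check whether one feed could carry both loads within the derated breaker limit.

    The 0.8 derating factor is an assumption, not a universal rule.
    """
    return feed_a_amps + feed_b_amps <= breaker_rating_amps * derating


# Example readings (made up for illustration).
print(classify(27.5, warning=27, critical=32))         # inlet temperature in Celsius
print(round(phase_imbalance_percent(10, 12, 14), 1))   # phase currents in amps
print(survives_failover(9, 8, breaker_rating_amps=20)) # A and B feed loads in amps
```

In the last example, the two feeds draw 17 A combined, which exceeds the 16 A derated limit of a 20 A breaker, so the cabinet would not survive the loss of one feed even though each feed looks lightly loaded on its own.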
Want to see how Sunbird’s world-leading DCIM software makes it easy for you to maintain uptime? Get your free test drive now!