
Blog & News

Welcome to Alucid's News section. Stay up to date on the types of projects we are working on, community events, and general industry news. Always feel free to reach out to us on one of our social platforms.

  • LinkedIn
  • X
  • Instagram
  • YouTube

3 Tips - Ensure Data Center Uptime

Alucid Team

  1. Over-Analyze

  2. Prioritize operational efficiency

  3. Keep it simple



I once heard Elon Musk say that his biggest challenge is creating a corrective feedback loop, and then maintaining that feedback loop for himself over time. It is extremely difficult as a human being to consistently critique oneself or to receive feedback that is not what you want to hear. So, I will start by saying: try to find time at least twice a year to check in with yourself.


As a Technology Executive, it is easy to overlook a lot of variables when doing your data center evaluation. That said, it is also very easy to over-analyze every little aspect of your critical environment and the risk factors associated with it.

The latter is exactly what I want you to do - over-analyze your environment and understand every risk factor that could cause your business to go down. The key is to not let your evaluation overwhelm you. Once you have compiled a comprehensive list of things you could do to harden your resiliency, sit down and prioritize what you need to get done based on:


1. Budget

2. Highest Risk factors, and

3. What enhances your team's operational efficiency.
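
If it helps to make the prioritization concrete, here is a minimal sketch of one way to rank a task list against those three criteria. Everything in it is an assumption for illustration: the `Task` fields, the 1-to-5 scoring scale, the task names, and the dollar figures are hypothetical, and ranking by risk first and efficiency second is just one reasonable ordering, not a prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cost: float      # estimated cost, in the same currency as the budget (hypothetical)
    risk: int        # 1 (low) to 5 (high): outage risk if the task is skipped
    efficiency: int  # 1 (low) to 5 (high): benefit to the team's operations


def prioritize(tasks, budget):
    """Rank tasks by risk, then efficiency, and keep those that fit the budget."""
    ranked = sorted(tasks, key=lambda t: (t.risk, t.efficiency), reverse=True)
    selected, remaining = [], budget
    for task in ranked:
        # Skip anything that no longer fits; a cheaper task further down may still fit.
        if task.cost <= remaining:
            selected.append(task)
            remaining -= task.cost
    return selected


# Made-up example data; score your own list with your own team.
tasks = [
    Task("UPS load test", cost=4000, risk=5, efficiency=3),
    Task("Generator fuel polishing", cost=3000, risk=2, efficiency=1),
    Task("Transfer switch test", cost=2000, risk=4, efficiency=3),
]
plan = prioritize(tasks, budget=7000)
print([t.name for t in plan])
```

The point of writing it down this way is only that the scoring forces the conversation: if a task cannot earn a risk or efficiency score from your team, that is a sign it may not belong on the list.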


Too often I find teams that build in processes or standard procedures simply for the sake of having them, or because an auditing body 'recommends' having them. Doing unnecessary activities in any business is a waste, and that is certainly true for lean IT organizations. I want you to re-read that... I am recommending you do not do any activities unless they are necessary. I am not saying, "do not add helpful or useful processes."


For example, I recently visited a customer's facility to do a free data center evaluation. The 3-man team walked me through their operations, and I was seriously impressed with how much they did for such a small organization. When they walked me through their preventative maintenance practices, I was taken aback. This team was polishing the diesel fuel in their generator every year, yet only did an annual inspection on their UPS system - no load testing; no transfer switch, circuit breaker, or maintenance bypass testing; no measurements of current or voltage.


Let's get back to the above - prioritize the highest risk factors and what enhances your team's operational efficiency. The #1 cause of data center outages is power failure due to UPS failure. The second is network failure due to DDoS attacks - not your generator. UPS preventative maintenance should be VERY high on your quarterly and annual to-do lists. This seems simple, but I so often find teams working on "what they heard someone else is doing", "what they can get done quickly", or "what can get them through until next year." When you couple this with the sense that every year you need to improve your team's operations, you tend to end up with a lot of wasted effort. So I leave you with this - keep it simple.


If there is one procedure I ask you to add to your operations, it is to take one day a year to review your team's list of processes and simplify them. Here are three questions to ask your team:

  1. Is this still important for us to do?

  2. Are there alternative methods of accomplishing the same thing?

  3. Is the time spent on this operation of greater value than time spent performing another?


Having an outside perspective helps create that feedback loop. It helps you identify when you are doing too much (or too little). It helps you reorient to what you set out to accomplish. It can also help you reaffirm that what you are doing is right. There is tremendous value in each of these.




© 2018 - 2025 by Alucid LLC
