These can include the overall or timed application uptime and downtime, the number of transactions completed, application responsiveness, errors and other availability-related metrics. Application scalability, reliability, recoverability and fault tolerance may also be considered when weighing application availability. Of these, the ones that IT teams typically care most about, especially as they relate to system performance, are availability and reliability. The measured availability percentage for a system or component can only be judged good or bad by comparing it with established service-level objectives.


Reliability and availability go hand in hand, as one is not possible without the other. Keep in mind that, although the formulas for these metrics look simple, properly defining what counts as a failure for your system is the difficult part. Since reliability is based on user experience, you need to use measures such as service-level indicators (SLIs) to understand what an acceptable service level is. Availability refers to the percentage of time a system is available to users.

The scheduled operation time is the total period when the asset is expected to perform work. The scheduled operation time excludes idle time (i.e. the time when an asset is not scheduled to operate). Ideally, assets should have as close to 100% availability as possible. While system-level availability is a critical measurement for operational business functions, the availability of key system components should also be consistently measured.
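The relationship between uptime and scheduled operation time can be written as a short calculation. The following is a minimal sketch, assuming availability is uptime divided by scheduled operation time, expressed as a percentage. The function name `availability` and the weekly figures are hypothetical and chosen only for illustration.

```python
def availability(uptime_hours: float, scheduled_hours: float) -> float:
    """Return availability as a percentage of scheduled operation time."""
    if scheduled_hours <= 0:
        raise ValueError("scheduled_hours must be positive")
    if not 0 <= uptime_hours <= scheduled_hours:
        raise ValueError("uptime_hours must be between 0 and scheduled_hours")
    return 100.0 * uptime_hours / scheduled_hours


# Hypothetical week for a single asset (illustrative numbers only).
calendar_hours = 7 * 24   # total hours in the week
idle_hours = 48           # asset not scheduled to operate
downtime_hours = 6        # asset scheduled but unable to work

# Idle time is excluded from the scheduled operation time.
scheduled_hours = calendar_hours - idle_hours
uptime_hours = scheduled_hours - downtime_hours

weekly_availability = availability(uptime_hours, scheduled_hours)
print(f"Availability: {weekly_availability:.1f}%")
```

Because idle time is removed before dividing, an asset that sits unscheduled over a weekend is not penalized for those hours.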


Common reliability, availability and serviceability (RAS) features include:

  • Virtual machines to decrease the severity of operating system software faults.
  • Partitioning/domaining of computer components to allow one large system to act as several smaller systems.
  • Predictive failure analysis to predict which intermittent correctable errors will eventually lead to hard non-correctable errors.

Permanent faults lead to a continuing error and are typically due to some physical failure such as metal electromigration or dielectric breakdown.

What does system availability mean for maintenance?

It means communicating with operations crews to have assets taken offline when needed. The difference between these measures allows for different perspectives on a plant’s ability to perform. The distinct importance of one from the other is shown by their individual definitions.

In many situations, the reason for the failure could have been identified beforehand as a risk and addressed accordingly. It’s easy to see which type of downtime is causing an issue with availability.
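One way to see which type of downtime is hurting availability is to total the recorded downtime per category. The following is a minimal sketch, assuming downtime is logged as (category, hours) pairs. The category names, the hours, and the helper `downtime_by_type` are hypothetical.

```python
from collections import defaultdict

# Hypothetical downtime log: (category, hours). Illustrative values only.
downtime_log = [
    ("planned maintenance", 4.0),
    ("unplanned breakdown", 6.5),
    ("planned maintenance", 2.0),
    ("changeover", 1.5),
]


def downtime_by_type(events):
    """Sum downtime hours per category, largest contributor first."""
    totals = defaultdict(float)
    for category, hours in events:
        totals[category] += hours
    # Sort by total hours, descending, so the biggest problem comes first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


totals = downtime_by_type(downtime_log)
for category, hours in totals.items():
    print(f"{category}: {hours:.1f} h")
```

Sorting by total hours puts the largest contributor at the top, which is usually the first place to look for a recurring, preventable cause.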

  • This is the ability to hot swap components or peripherals, making upgrades and repairs easier.
  • Alternatively, availability can be defined as the duration of time that a plant or particular equipment is able to perform its intended tasks.
  • At all times, information must be available to those with clearance.
  • To improve availability, organizations generally use replication techniques that create redundant data copies to enable continuous data access.
  • Although availability status can change over time, there is no such thing as varying degrees of availability.
Maintenance managers rely on this availability figure to determine how effectively existing maintenance strategies, activities, and schedules are maintaining uptime. Availability, maintainability and reliability have distinct, if related, meanings, and they each play different roles in reliability operations. Because each measures a different aspect of a system’s status, putting them together is a useful means of gaining insight into the overall reliability of a system.

Why Are Availability and Reliability Crucial?

Manufacturers, warehousers, and oil and gas providers are among the organizations most likely to track this key performance indicator. As mentioned before, availability expresses the probability that an asset will be available when needed for production. Also referred to as equipment or asset availability, this availability measure is an essential metric for organizations that depend on complex pieces of equipment to function. A system that one SRE considers easy to maintain could seem difficult to maintain to another SRE. In contrast, an application that responds to almost all requests but suffers from high latency or error rates would not be very reliable, although it might be highly available. Be constantly on the lookout for ways to streamline your maintenance and production processes, improve quality, and eliminate defects.


Table 1 shows some examples of system availability and hours of downtime per system per year. Maintenance management methods, standard operating procedures, and maintenance tools all influence system availability. The KPI helps organizations gauge how well they maintain the tangible assets necessary for meeting high production standards.
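The link between an availability percentage and yearly downtime is simple arithmetic. The following is a minimal sketch, assuming a system scheduled to run around the clock for a 365-day year (8,760 hours). The function name `downtime_hours_per_year` and the sample percentages are illustrative and are not taken from Table 1.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def downtime_hours_per_year(availability_percent: float) -> float:
    """Return the yearly downtime implied by an availability percentage."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return HOURS_PER_YEAR * unavailable_fraction


# Illustrative availability levels (assumed, not from the source table).
for percent in (99.0, 99.9, 99.99):
    hours = downtime_hours_per_year(percent)
    print(f"{percent}% available -> about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99% availability allows roughly 87.6 hours of downtime per year, and each additional "nine" cuts that allowance by a factor of ten.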

What is the availability requirement?

If a buggy application release can be quickly fixed by rolling back to a stable version, the application would have a high degree of maintainability. On the other hand, if you have a server that needs to be rebuilt manually after it fails, it’s not very maintainable. As you work on improving reliability in your facility, don’t stop after each step. It’s a continuous process, and you’ll need to keep working to improve upon each new procedure, practice, and task you implement. To achieve world-class reliability in your facility, it’s not enough to just keep equipment up and running as much as possible.


