Transparent service level agreement policy
At OnCloud we value transparency. We know that accurate and timely information allows businesses to grow and to respond to any contingency that may arise. That is why we update our service level agreement figures on this page every month, so that you can check our performance and give us your vote of confidence. If we are experiencing a problem, we publish it in the Incident Log and on the OnCloud Twitter account (@oncloudmx).
The current SLA is 99.9691692777376% and is updated every hour. If everything works well the level rises; if an incident occurs, it decreases.
|Year||Percentage of availability||Downtime minutes||Minutes in the year|
|Year||Month||Electric Power||Servers||WAN (Internet)||LAN||Hybrid Storage||Flash Storage||Downtime minutes||Percentage of availability|
OnCloud Committed Service Level Agreement
| ||Servers||Electric Power||WAN (Internet)||LAN||Storage||OnCloud SLA|
|Downtime per day||1.44 minutes||8.64 seconds||8.64 minutes||43.2 seconds||1.44 minutes||2.47 minutes|
|Downtime per week||10.08 minutes||60.48 seconds||1.008 hours||5.04 minutes||10.08 minutes||17.3376 minutes|
|Downtime per month||40.32 minutes||4.032 minutes||4.032 hours||20.16 minutes||40.32 minutes||1.15584 hours|
|Downtime per year||8.064 hours||48.384 minutes||48.384 hours||4.032 hours||8.064 hours||13.87 hours|
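The figures in this table can be reproduced from an availability percentage. The sketch below is a reading of the table rather than OnCloud's published method: it assumes a 7-day week, a 4-week month, and a 12-month year (336 days), because those multipliers are what the table's numbers imply. The function name `downtime_budget_minutes` is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def downtime_budget_minutes(availability_pct: float) -> dict:
    """Allowed downtime, in minutes, for one day, week, month, and year."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    per_day = (100.0 - availability_pct) / 100.0 * MINUTES_PER_DAY
    per_week = per_day * 7     # 7-day week
    per_month = per_week * 4   # assumption: 4-week month, as the table implies
    per_year = per_month * 12  # assumption: 12 such months (336 days)
    return {"day": per_day, "week": per_week, "month": per_month, "year": per_year}


# 99.828% is the overall commitment stated on this page.
budget = downtime_budget_minutes(99.828)
for period, minutes in budget.items():
    print(f"{period}: {minutes:.4f} minutes ({minutes / 60:.4f} hours)")
```

With 99.828% this gives about 2.4768 minutes per day and about 13.87 hours per year, in line with the OnCloud SLA column.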
Methodology for calculating the service level
What is the SLA and how is it calculated?
An SLA (Service Level Agreement), or Committed Service Level Agreement, is a written contract between a service provider and its client that sets the agreed level of quality for the expected service. The SLA is a tool that helps both parties reach a consensus on that level of quality.
At OnCloud the SLA is divided into 6 parts:
- SLA Telecom: WAN (wide area network).
- SLA Power: redundant electric power.
- SLA Communications: LAN (local area network).
- SLA Servers: virtual machine monitor.
- SLA Hybrid Shared Storage: where all the data resides.
- SLA Shared Flash Storage: where all the data resides.
How do we calculate our SLA in real time?
To calculate the SLA in real time, we divide the total downtime by the minutes that have passed since August 8, 2014 up to today, express the result as a percentage, and subtract it from 100.
SLA = 100 - (100 * Sum(downtime) / (minutes since August 1, 2014))
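As an illustration, the real-time formula can be sketched in a few lines of Python. The function name `realtime_sla` and the sample values are hypothetical, and the measurement start is passed in as a parameter rather than hard-coded.

```python
from datetime import datetime


def realtime_sla(downtime_minutes, start, now):
    """Availability percentage: 100 - 100 * total downtime / elapsed minutes."""
    elapsed_minutes = (now - start).total_seconds() / 60
    if elapsed_minutes <= 0:
        raise ValueError("now must be later than start")
    return 100 - 100 * sum(downtime_minutes) / elapsed_minutes


# Hypothetical example: 14.4 minutes of downtime over one day (1440 minutes)
# is 1% of the elapsed time, so the SLA is approximately 99%.
example = realtime_sla([14.4], datetime(2014, 8, 1), datetime(2014, 8, 2))
print(example)
```

Because the denominator keeps growing, the value rises while nothing fails and drops as soon as new downtime is added to the sum.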
How do we calculate our SLA per Year?
To calculate the SLA per year, we divide the total downtime recorded in each year by the number of minutes in that year.
SLA = 100 - (100 * Sum(downtime per year) / (minutes in year))
How do we calculate our SLA per month?
We use the same formula as for the SLA per year, but group the downtime by month.
SLA = 100 - (100 * Sum(downtime per month) / (minutes in month))
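The per-year and per-month calculations differ only in how incidents are grouped. The sketch below groups a list of `(timestamp, downtime in minutes)` pairs and applies the same formula to each group. It assumes calendar months and calendar years for the denominator, which this page does not specify. The function names and the sample incidents are hypothetical.

```python
import calendar
from collections import defaultdict
from datetime import datetime


def minutes_in_period(year, month=None):
    """Minutes in a calendar year, or in a calendar month if month is given."""
    if month is None:
        days = 366 if calendar.isleap(year) else 365
    else:
        days = calendar.monthrange(year, month)[1]
    return days * 24 * 60


def sla_by_period(incidents, period="month"):
    """Map (year, month) or (year,) to the availability percentage."""
    if period not in ("month", "year"):
        raise ValueError("period must be 'month' or 'year'")
    downtime = defaultdict(float)
    for when, minutes in incidents:
        key = (when.year, when.month) if period == "month" else (when.year,)
        downtime[key] += minutes
    return {
        key: 100 - 100 * total / minutes_in_period(*key)
        for key, total in downtime.items()
    }


# Hypothetical incidents, not taken from the incident log.
incidents = [
    (datetime(2021, 1, 5, 10, 0), 30.0),
    (datetime(2021, 1, 20, 3, 15), 14.64),
    (datetime(2021, 2, 1, 0, 0), 40.32),
]
monthly = sla_by_period(incidents, "month")
yearly = sla_by_period(incidents, "year")
print(monthly)
print(yearly)
```

Periods without any incident do not appear in the result; under this formula their SLA would be 100%.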
|Date and hour||Area||Severity||Event duration in minutes||Description|
|2020-06-03 21:05:00||Wan||Alert||5||Internet link intermittence.|
|2020-04-04 00:40:00||Servers||Partial Error||120||Due to a failure in a virtual cluster, the high availability mechanisms were not triggered. Manual intervention was required to solve the problem and restore the affected services.|
|2020-03-26 00:00:00||Servers||Partial Error||120||Hardware failure in a server that activated the high availability mechanisms and restarted the virtual servers in another host available in the OnCloud infrastructure.|
|2020-03-14 09:27:00||Servers||Partial Error||30||Hardware failure in a server that activated the high availability mechanisms and restarted the virtual servers in another host available in the OnCloud infrastructure.|
|2019-11-20 06:42:00||Servers||Alert||18||Hardware failure in a server, which activated the high availability mechanism.|
|2019-08-26 11:22:00||Servers||Alert||15||One of our machines had a hardware failure, so the high availability mechanisms were activated. The equipment is currently operating normally and will remain under monitoring in order to identify possible failures derived from the incident.|
|2018-02-02 16:00:00||LAN||Information||2||LAN convergence.|
|2017-12-09 23:31:00||Hybrid Storage||Partial Error||8||A failure occurred in the storage and redundancy was activated.|
|2017-10-11 04:08:00||WAN||Information||1||One of the links presented a switchover without affecting customers.|
|2016-08-02 22:15:00||WAN||Alert||5||A mandatory maintenance window was scheduled for one of the firewalls to avoid incidents and improve the security of the environment.|
|2016-05-16 07:35:00||Hybrid Storage||Partial Error||42||An inconsistency was detected in one of the mechanical Datastores. ===== Update 7:47 am ==== The problem is in the Snapshots module that is duplicating the job filling the Datastore faster than it frees it. The VMs that are in this datastore are not in operation. ===== Update 7:56 am ===== The problem has been solved, customers begin to report that their service is operating. ===== Update 8:09 am ===== All clients are operational. A diagnosis and verification has been run in the datastore and everything went satisfactorily.|
|2016-05-06 09:31:00||Wan||Partial Error||96||The Core switch of one of the internet providers failed.|
|2016-04-26 15:47:00||LAN||Partial Error||47||A local network device (switch) failed and had to be replaced; this affected some servers. Actions will be taken to partition the network into smaller regions and avoid similar problems. Currently all services are operating normally.|
|2016-04-22 20:12:00||Wan||Alert||365||One of the internet links suffered a cut in its fiber. This problem only affects clients who have contracted special links.|
|2016-02-19 23:18:00||LAN||Partial Error||6||As noted in the previous report, the equipment that was not working correctly has just been replaced. All services are operating correctly and we have run all the diagnostic tests.|
|2016-02-19 18:57:00||WAN||Information||0||One of the perimeter network devices is faulty. It does not affect any client yet, so we will replace it at 9:30 pm to avoid unexpected downtime.|
|2016-02-18 22:55:00||WAN||Partial Error||1||As reported 24 hours ago, one of the network devices was updated to avoid vulnerabilities.|
|2016-02-18 10:43:00||WAN||Alert||0||A review of the perimeter network equipment has been opened to ensure access to services. The aim is to keep improving and avoid unexpected drops in service. No impact on our customers.|
|2016-02-17 18:41:00||LAN||Information||0||A vulnerability has been detected on a device. A MANDATORY upgrade must be done, which will affect some clients for approximately 5 minutes. The upgrade has been scheduled for 11:45 pm.|
|2016-02-11 11:45:00||WAN||Alert||16||A mandatory maintenance window was opened by one of the internet providers. No client, instance or service was affected. There is no impact on the Committed Service Level (SLA).|
|2016-02-10 21:06:00||WAN||Partial Error||2||There was a minor failure in one of the firewalls, which we proceeded to correct. ===== Update 10/2/2016 9:12 AM ===== The system is working correctly.|
|2016-02-03 12:01:00||WAN||Alert||180||We detected that the internet connection was very slow and that certain clients were disconnected. The connections coming through the third-party carrier Bestel showed the most failures. The fault was reported to that carrier and has been solved.|
|2014-08-25 13:01:00||WAN||Error||420||Adjustment to the SLA. This entry covers the SLA from August 2014 to December 31, 2015, before the current automatic SLA measurement system was in place.|
What happens if we fail to meet the level of service?
At OnCloud we strive every day to exceed our commitment of 99.828% (13.87 hours of failures per year). If this commitment is not met, we will offer our customers a discount of 5% on their monthly billing. For more information about this guarantee, review your OnCloud agreement with your account executive.