Building on top of third-party cloud-hosted services, mostly API services, has become the norm. As that dependency grows, so do the pros and cons that come with it. There is no doubt that the wide adoption of API-based interfaces works well for developers: it lets engineers build cool things and reduces maintenance effort. Still, here are a few points that we can't ignore any longer:
If we talk about the outages that occurred throughout the year, the status was often not clearly visible (the Atlassian outage, for example, went on for days without any details provided). On my blog I have posted a few breakdowns of these outages, but those don't help in real time for the folks who were on call while the outages were happening.
Providers are hesitant to show the real status to customers because it exposes their faults and creates a bad image. In the end, though, this makes the troubleshooting team's job really difficult, because they cannot tell on whose side the issue actually occurred.
Another possible reason is the lack of a properly structured incident handling process. A few companies do this really well (Atlassian came clean with a proper incident report in an article; Google always posts incident reports, not in great detail, but they at least admit the incident). Some providers, on the other hand, hide the incident completely and publish no proper report.
Incidents like these should be clearly stated on the provider's status page. Because providers are not doing this properly, customers have to go to Twitter or DownDetector instead. This makes customers lose their faith in the providers, and the dashboards and status pages lose their own credibility.
There should be a proper communication channel through which both vendor and customer can reach a point where they know what the possible blocker is. Instead of learning the reason from other places such as Twitter and a few other sites, it's better if the information comes from the vendor itself. When troubleshooting a particular issue, being able to rule out irrelevant causes makes the process much easier.
Major outages are always documented and visible, but what about minor outages? They also have an impact on the business. It would help if vendors posted their disaster recovery plans and the decisions they take throughout an incident. Along with this, it would be great if the third party's performance data were also visible on the dashboard.
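To make "performance data" a bit more concrete, here is a rough sketch of the kind of numbers I have in mind: availability and 95th-percentile latency computed from a series of checks against a vendor API. This is only an illustration under my own assumptions. The `Probe` record and the `summarize` function are hypothetical names I made up, not part of any vendor's tooling, and the percentile uses the simple nearest-rank method.

```python
from dataclasses import dataclass
from math import ceil
from typing import Optional, Sequence


@dataclass(frozen=True)
class Probe:
    """One check against a vendor endpoint (hypothetical record)."""
    ok: bool            # True if the call succeeded
    latency_ms: float   # round-trip time in milliseconds


def summarize(probes: Sequence[Probe]) -> dict:
    """Return availability (%) and p95 latency of the successful calls."""
    if not probes:
        raise ValueError("need at least one probe")
    # Only successful calls count towards latency; failures count against availability.
    good = sorted(p.latency_ms for p in probes if p.ok)
    availability = 100.0 * len(good) / len(probes)
    # Nearest-rank percentile: the smallest sample with at least 95% of samples at or below it.
    p95: Optional[float] = good[ceil(0.95 * len(good)) - 1] if good else None
    return {"availability_pct": availability, "p95_latency_ms": p95}
```

Feeding it something like `[Probe(True, 100.0), Probe(True, 200.0), Probe(False, 0.0), Probe(True, 300.0)]` gives 75% availability. Even small numbers like these, published per endpoint, would tell a customer far more than a status page that is always green.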
If any one vendor takes care of all these things, customers will move to them, and competition will push other third-party services to implement the same. Of course, all of this transparency and visibility depends on trust between vendor and customer, which can create a whole new dynamic.
Thanks for reading this far 😊
Source: Metrist article