UI Outage
Incident Report for CloudWisdom
Postmortem

On Sunday at 12:00PM UTC (8:00AM Eastern) we had an internal SSL certificate expire. This certificate is only used for internal load balancers (not our data ingest, API, or UI endpoints) causing internal communication issues. Our API service was not able to communicate with some of our backend services resulting in an unusable user interface.

We discovered the issue at 9:40AM Eastern and updated the certificate, restoring access to the UI at 10:00AM Eastern.

We recognize that behind our firewall API endpoints which are not public do not require SSL and therefore have decided to remove them in this specific case, which prevents us from being exposed to certificate expirations. For all public-facing endpoints we will continue to force the use of SSL and will monitor them more closely for expiration.

Posted Sep 22, 2020 - 10:42 EDT

Resolved
We've resolved the incident. We'll add a public statement to this incident within a business day.
Posted Sep 20, 2020 - 10:04 EDT
Monitoring
We've implemented a fix. The UI is operational again. We'll hold in monitoring as we check other systems.
Posted Sep 20, 2020 - 10:00 EDT
Identified
We're having an internal issue with HTTP connections which has resulting in a UI outage. We're working to fix the issue.
Posted Sep 20, 2020 - 09:53 EDT
This incident affected: Data Processing and Data Access.