Degradation Report - Service Interruption for api.veryfi.com
Started 18 Mar at 12:51pm UTC, resolved 18 Mar at 04:02pm UTC.
At 16:02:00 UTC, we were able to resolve the issue by killing all malfunctioning instances and implementing updated configurations to prevent similar errors from occurring in the future.
- Date: 2023-03-18
- Start Time: 12:50:00 UTC
- End Time: 16:02:00 UTC
- Impact: Random HTTP 500 server responses
On March 18, 2023, at 12:50:00 UTC, our monitoring systems detected some newly started Veryfi instances that were malfunctioning. This resulted in random HTTP 500 server responses being served to our API clients. We understand that this may have caused disruptions in the normal use of our services.
If you have implemented a retry mechanism when Veryfi API fails, it is likely that most of your requests were successfully processed on the second attempt during this period of service degradation.
Upon identifying the issue, our engineering team immediately started investigating the root cause and working on a fix. At 16:02:00 UTC, we were able to resolve the issue by killing all malfunctioning instances and implementing updated configurations to prevent similar errors from occurring in the future.
- To ensure that such an incident does not happen again, we have taken the following steps:
- Implemented a more stringent monitoring system to detect and address issues in instances more promptly.
- Reviewed and enhanced our instance deployment process to ensure proper configuration before instances go live.
- Conducted a thorough post-mortem analysis to identify any additional areas of improvement in our infrastructure and processes.