USSUP-615 Email2sms SMTP processing delays

Incident Report for 2sms LLC

Postmortem

Start Date: 12/19/2024 8:27am (EST) / 19th December 2024 13:27 (UTC)

 

Finish Date: 12/19/2024 9:54am (EST) / 19th December 2024 14:54 (UTC)

 

Description:

Processing delays on SMTP and Email2sms inbox polling for some customers

 

Impacted Services:

  1. SMTP

Impacted Customers:

  1. Some customers using SMTP

Cause:

During routine monthly infrastructure maintenance, a failback of the SMTP processors partially failed because of connection resets and a dependency startup failure. The dependency took abnormally long to start and failed in a way that the SMTP processor did not detect. As a result, email processing failed for some customers.
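
For illustration only, the kind of startup guard that surfaces a slow or failed dependency can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of the 2sms implementation: the names wait_for_dependency, DependencyNotReady and probe, and all timeout values, are hypothetical.

```python
import time
from typing import Callable


class DependencyNotReady(RuntimeError):
    """Raised when the dependency does not become ready in time (hypothetical name)."""


def wait_for_dependency(
    probe: Callable[[], bool],
    timeout_s: float,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> float:
    """Poll `probe` until it returns True; return the seconds waited.

    Connection errors (including resets) are treated as "not ready yet".
    If the deadline passes, raise instead of letting the caller start
    accepting traffic with a dependency that never came up.
    """
    start = clock()
    while True:
        try:
            if probe():
                return clock() - start
        except ConnectionError:
            # ConnectionResetError is a subclass; retry until the deadline.
            pass
        if clock() - start >= timeout_s:
            raise DependencyNotReady(
                f"dependency not ready after {timeout_s} seconds"
            )
        sleep(interval_s)
```

A processor using a guard like this would refuse to start, or report itself unhealthy, when the exception is raised, so the failure becomes visible instead of silent.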

 

Detection:

Only a small number of SMTP processors were affected, so the issue remained below the detection threshold of our internal monitoring systems. These systems identify disruptions and anomalies based on scale and impact, and the limited scope of this problem meant that no automated alerts were triggered. 2sms became aware of the issue only when a customer reported it, which prompted an immediate investigation.
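
The gap between scale-based alerting and per-processor checks can be shown with a small sketch. All figures and names here are hypothetical (20 processors, a 10% alert threshold, the processor name smtp-07); they are not taken from the 2sms monitoring configuration.

```python
from typing import Dict, List


def aggregate_alert(processed: Dict[str, int], threshold: float = 0.10) -> bool:
    """Alert only when the share of stalled processors exceeds the threshold."""
    if not processed:
        return False
    stalled = sum(1 for count in processed.values() if count == 0)
    return stalled / len(processed) > threshold


def stalled_processors(processed: Dict[str, int]) -> List[str]:
    """Per-processor check: name every processor that handled no messages."""
    return sorted(name for name, count in processed.items() if count == 0)


# Hypothetical fleet: 20 processors, one of which processed nothing.
fleet = {f"smtp-{i:02d}": 120 for i in range(20)}
fleet["smtp-07"] = 0

print(aggregate_alert(fleet))     # 1 of 20 stalled is 5%, below the 10% threshold
print(stalled_processors(fleet))  # the per-processor check still names smtp-07
```

In this sketch the aggregate check stays silent while the per-processor check flags the stalled processor, which is the pattern described above.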

Corrective Actions:

Within 20 minutes of the customer report, the issue was traced to the affected processors, which were then restarted. Normal traffic flow then resumed.

 

Preventative Actions:

We have changed the routine maintenance process to account for an extended dependency startup delay. We will also improve our monitoring services and implement alternative processor technologies for SMTP traffic to avoid a recurrence of this dependency issue.

Posted Dec 23, 2024 - 11:13 UTC

Resolved

We saw processing issues for Email2sms SMTP traffic for some customers. The issue is now resolved.
Posted Dec 19, 2024 - 13:30 UTC